Merge multidimensional NumPy arrays based on first row - python

I have to work with sensor data (from ROS, specifically, though that should not be relevant). To this end, I have several 2-D NumPy arrays, each with one row storing the timestamps and the remaining rows storing the corresponding sensor data. The problem is that these arrays do not have the same dimensions (different sampling times). I need to merge all of these arrays into a single big one. How can I do so based on the timestamp and, say, replace the missing values with 0 or NaN?
Example of my situation:
import numpy as np
time1=np.arange(1,10)
data1=np.random.randint(200, size=time1.shape)
a=np.array((time1,data1))
print(a)
time2=np.arange(1,10,2)
data2=np.random.randint(200, size=time2.shape)
b=np.array((time2,data2))
print(b)
Which returns output
[[ 1 2 3 4 5 6 7 8 9]
[ 51 9 117 174 164 60 95 197 30]]
[[ 1 3 5 7 9]
[ 35 188 114 153 36]]
What I am looking for is
[[ 1 2 3 4 5 6 7 8 9]
[ 51 9 117 174 164 60 95 197 30]
[ 35 0 188 0 114 0 153 0 36]]
Is there any way to achieve this in an efficient way? This is an example but I am working with thousands of samples. Thanks!

For the simple case of one b-matrix
Assuming the first row of a stores all possible timestamps and the first rows of both a and b are sorted, we can use np.searchsorted -
idx = np.searchsorted(a[0],b[0])
out_dtype = np.result_type(a.dtype, b.dtype)
b0 = np.zeros(a.shape[1],dtype=out_dtype)
b0[idx] = b[1]
out = np.vstack((a,b0))
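The question also asks about filling the gaps with NaN instead of 0. A minimal sketch of that variant, assuming (as above) that a[0] is sorted and contains every timestamp in b[0]; the fixed sample values are taken from the question's printed output, and the name row is only illustrative:

```python
import numpy as np

a = np.array([[1, 2, 3, 4, 5, 6, 7, 8, 9],
              [51, 9, 117, 174, 164, 60, 95, 197, 30]])
b = np.array([[1, 3, 5, 7, 9],
              [35, 188, 114, 153, 36]])

# Column positions of b's timestamps within a's timestamp row
idx = np.searchsorted(a[0], b[0])

# NaN needs a float row, so start from an all-NaN float array
row = np.full(a.shape[1], np.nan)
row[idx] = b[1]

# vstack promotes the integer rows of a to float64
out = np.vstack((a, row))
print(out)
```

Because NaN has no integer representation, the merged array is floating point throughout, which is the price of using NaN as the missing-value marker.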
For several b-matrices
Approach #1
To extend to multiple b-matrices, we can follow a similar method with np.searchsorted within a loop, like so -
def merge_arrays(a, B):
    # a : Array with first row holding all possible timestamps
    # B : list or tuple of all b-matrices
    lens = np.array([len(i) for i in B])
    # Total output rows: all rows of a plus the data rows of every b-matrix
    L = (lens-1).sum() + len(a)
    # Include a's dtype so that its values are not truncated
    out_dtype = np.result_type(a.dtype, *[i.dtype for i in B])
    out = np.zeros((L, a.shape[1]), dtype=out_dtype)
    out[:len(a)] = a
    s = len(a)
    for b_i in B:
        idx = np.searchsorted(a[0],b_i[0])
        out[s:s+len(b_i)-1,idx] = b_i[1:]
        s += len(b_i)-1
    return out
Sample run -
In [175]: a
Out[175]:
array([[ 4, 11, 16, 22, 34, 56, 67, 87, 91, 99],
[ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]])
In [176]: b0
Out[176]:
array([[16, 22, 34, 56, 67, 91],
[20, 80, 69, 79, 47, 64],
[82, 88, 49, 29, 19, 19]])
In [177]: b1
Out[177]:
array([[ 4, 16, 34, 99],
[28, 34, 0, 0],
[36, 53, 5, 38],
[17, 79, 4, 42]])
In [178]: merge_arrays(a, [b0,b1])
Out[178]:
array([[ 4, 11, 16, 22, 34, 56, 67, 87, 91, 99],
[ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
[ 0, 0, 20, 80, 69, 79, 47, 0, 64, 0],
[ 0, 0, 82, 88, 49, 29, 19, 0, 19, 0],
[28, 0, 34, 0, 0, 0, 0, 0, 0, 0],
[36, 0, 53, 0, 5, 0, 0, 0, 0, 38],
[17, 0, 79, 0, 4, 0, 0, 0, 0, 42]])
Approach #2
If looping with np.searchsorted seems to be the bottleneck, we can vectorize that part -
def merge_arrays_v2(a, B):
    # a : Array with first row holding all possible timestamps
    # B : list or tuple of all b-matrices
    lens = np.array([len(i) for i in B])
    L = (lens-1).sum() + len(a)
    # Include a's dtype so that its values are not truncated
    out_dtype = np.result_type(a.dtype, *[i.dtype for i in B])
    out = np.zeros((L, a.shape[1]), dtype=out_dtype)
    out[:len(a)] = a
    s = len(a)
    # Look up all timestamps with a single searchsorted call
    r0 = [i[0] for i in B]
    r0s = np.concatenate(r0)
    idxs = np.searchsorted(a[0],r0s)
    # Start/stop offsets of each b-matrix within idxs
    cols = np.array([i.shape[1] for i in B])
    sp = np.r_[0,cols.cumsum()]
    start,stop = sp[:-1],sp[1:]
    for (b_i,s0,s1) in zip(B,start,stop):
        idx = idxs[s0:s1]
        out[s:s+len(b_i)-1,idx] = b_i[1:]
        s += len(b_i)-1
    return out

Here's an approach using np.searchsorted:
time1=np.arange(1,10)
data1=np.random.randint(200, size=time1.shape)
a=np.array((time1,data1))
# array([[ 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [118, 105, 86, 94, 69, 17, 142, 46, 54]])
time2=np.arange(1,10,2)
data2=np.random.randint(200, size=time2.shape)
b=np.array((time2,data2))
# array([[ 1, 3, 5, 7, 9],
# [70, 15, 4, 97, 57]])
out = np.vstack([a, np.zeros(a.shape[1])])
out[out.shape[0]-1, np.searchsorted(a[0], b[0])] = b[1]
array([[ 1., 2., 3., 4., 5., 6., 7., 8., 9.],
[118., 105., 86., 94., 69., 17., 142., 46., 54.],
[ 70., 0., 15., 0., 4., 0., 97., 0., 57.]])
Update - Merging many matrices
Here's an almost fully vectorised approach for a scenario with multiple b matrices. This approach does not require a priori knowledge of which is the largest list:
def merge_timestamps(*x):
    # infer which is the list with maximum length
    # as well as individual lengths
    concat = np.concatenate(*x, axis=1)[0]
    lens = np.r_[np.flatnonzero(np.diff(concat) < 0), len(concat)]
    max_len_list = np.r_[lens[0], np.diff(lens)].argmax()
    # define the output matrix: both rows of the largest array,
    # followed by one row of zeros per remaining array
    A = x[0][max_len_list]
    out = np.vstack([A, np.zeros((len(*x)-1, len(A[0])))])
    others = np.flatnonzero(~np.isin(np.arange(len(*x)), max_len_list))
    # Update the output matrix with the values of the smaller
    # arrays according to their index. This is of course assuming
    # all values are contained in the largest, and that its
    # timestamps are consecutive integers
    for ix, i in enumerate(others):
        out[-(ix+1), x[0][i][0]-A[0].min()] = x[0][i][1]
    return out
Let's check with the following example:
time1=np.arange(1,10)
data1=np.random.randint(200, size=time1.shape)
a=np.array((time1,data1))
# array([[ 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [107, 13, 123, 119, 137, 135, 65, 157, 83]])
time2=np.arange(1,10,2)
data2=np.random.randint(200, size=time2.shape)
b = np.array((time2,data2))
# array([[ 1, 3, 5, 7, 9],
# [ 81, 49, 83, 32, 179]])
time3=np.arange(1,4,2)
data3=np.random.randint(200, size=time3.shape)
c=np.array((time3,data3))
# array([[ 1, 3],
# [185, 117]])
merge_timestamps([a,b,c])
array([[ 1., 2., 3., 4., 5., 6., 7., 8., 9.],
[107., 13., 123., 119., 137., 135., 65., 157., 83.],
[185., 0., 117., 0., 0., 0., 0., 0., 0.],
[ 81., 0., 49., 0., 83., 0., 32., 0., 179.]])
As mentioned this approach does not require a priori knowledge of which is the largest list, i.e. it would also work with:
merge_timestamps([b, c, a])
array([[ 1., 2., 3., 4., 5., 6., 7., 8., 9.],
[107., 13., 123., 119., 137., 135., 65., 157., 83.],
[185., 0., 117., 0., 0., 0., 0., 0., 0.],
[ 81., 0., 49., 0., 83., 0., 32., 0., 179.]])
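Both answers above assume that one array already contains every timestamp. If that cannot be guaranteed, one possible variation is to build the common time axis from the union of all timestamps first. The sketch below is not taken from either answer; merge_on_union and its fill parameter are hypothetical names, and it assumes timestamps are unique within each array:

```python
import numpy as np

def merge_on_union(arrays, fill=np.nan):
    """Merge 2-D arrays (row 0 = timestamps) onto the union of all timestamps."""
    # Sorted, de-duplicated time axis covering every input array
    times = np.unique(np.concatenate([arr[0] for arr in arrays]))
    # One timestamp row plus every data row of every input
    n_rows = 1 + sum(arr.shape[0] - 1 for arr in arrays)
    out = np.full((n_rows, times.size), fill, dtype=float)
    out[0] = times
    row = 1
    for arr in arrays:
        # Each timestamp exists in `times`, so searchsorted gives its exact column
        cols = np.searchsorted(times, arr[0])
        k = arr.shape[0] - 1
        out[row:row + k, cols] = arr[1:]
        row += k
    return out

a = np.array([[1, 2, 4], [10, 20, 40]])
b = np.array([[2, 3], [7, 8]])
merged = merge_on_union([a, b])
print(merged)
```

Here neither a nor b holds all timestamps, yet every sample lands in the column of its own timestamp and the remaining cells keep the fill value.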

This is applicable only if the sensor captures data at a fixed interval.
First, create a dataframe with a fixed interval (15 minutes in this case), then merge the sensor's data into it.
Code to generate a dataframe with a 15-minute interval (copied)
import numpy as np
import pandas as pd

l = (pd.DataFrame(columns=['NULL'],
                  index=pd.date_range('2016-09-02T17:30:00Z', '2016-09-02T21:00:00Z',
                                      freq='15T'))
       .between_time('07:00','21:00')
       .index.strftime('%Y-%m-%dT%H:%M:%SZ')
       .tolist()
    )
l = pd.DataFrame(l)
Assume the data below comes from the sensor
m = (pd.DataFrame(columns=['NULL'],
                  index=pd.date_range('2016-09-02T17:30:00Z', '2016-09-02T21:00:00Z',
                                      freq='30T'))
       .between_time('07:00','21:00')
       .index.strftime('%Y-%m-%dT%H:%M:%SZ')
       .tolist()
    )
m = pd.DataFrame(m)
m['SensorData'] = np.arange(8)
Merge the two dataframes above
df = l.merge(m, left_on = 0, right_on= 0,how='left')
# Timestamps with no sensor reading come out as NaN; replace them with 0
df['SensorData'] = df['SensorData'].fillna(0)
Output
0 SensorData
0 2016-09-02T17:30:00Z 0.0
1 2016-09-02T17:45:00Z 0.0
2 2016-09-02T18:00:00Z 1.0
3 2016-09-02T18:15:00Z 0.0
4 2016-09-02T18:30:00Z 2.0
5 2016-09-02T18:45:00Z 0.0
6 2016-09-02T19:00:00Z 3.0
7 2016-09-02T19:15:00Z 0.0
8 2016-09-02T19:30:00Z 4.0
9 2016-09-02T19:45:00Z 0.0
10 2016-09-02T20:00:00Z 5.0
11 2016-09-02T20:15:00Z 0.0
12 2016-09-02T20:30:00Z 6.0
13 2016-09-02T20:45:00Z 0.0
14 2016-09-02T21:00:00Z 7.0
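The same pandas idea can be applied directly to the 2-row arrays from the question, without requiring a fixed sampling interval. A minimal sketch, assuming small made-up sample values and hypothetical column names sensor_a and sensor_b; pd.concat with axis=1 aligns the series on their timestamp index and leaves NaN where a sensor has no reading:

```python
import numpy as np
import pandas as pd

a = np.array([[1, 2, 3, 4, 5],
              [10, 20, 30, 40, 50]])
b = np.array([[1, 3, 5],
              [7, 8, 9]])

# One Series per sensor, indexed by its own timestamps
sa = pd.Series(a[1], index=a[0], name="sensor_a")
sb = pd.Series(b[1], index=b[0], name="sensor_b")

# Outer alignment on the index; missing readings become NaN
merged = pd.concat([sa, sb], axis=1)

# Optional: replace the gaps with 0 instead
filled = merged.fillna(0)
print(merged)
```

If a plain NumPy array is needed afterwards, merged.to_numpy().T gives one row per sensor, with the timestamps available from merged.index.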

Related

Extract a block from an 2d array

Suppose you have a 2D array filled with integers in a continuous manner, going from left to right and top to bottom. Hence it would look like
[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]]
Suppose now you have a 1D array of some of the integers shown in the array above. Let's say this array is [6,7,11]. I want to extract the block/chunk of the 2D array that contains the elements of the list. With these two inputs the result should be
[[ 6., 7.],
[11., nan]]
(I am padding with np.nan as it cannot be reshaped)
This is what I have written. Is there a simpler way please?
import numpy as np
def my_fun(my_list):
    ids_down = 4
    ids_across = 5
    layout = np.arange(ids_down * ids_across).reshape((ids_down, ids_across))
    ids = np.where((layout >= min(my_list)) & (layout <= max(my_list)), layout, np.nan)
    r,c = np.unravel_index(my_list, ids.shape)
    out = np.nan*np.ones(ids.shape)
    for i, t in enumerate(zip(r,c)):
        out[t] = my_list[i]
    ax1_mask = np.any(~np.isnan(out), axis=1)
    ax0_mask = np.any(~np.isnan(out), axis=0)
    out = out[ax1_mask, :]
    out = out[:, ax0_mask]
    return out
Then trying my_fun([6,7,11]) returns
[[ 6., 7.],
[11., nan]]
This 100% NumPy solution works for both contiguous and non-contiguous arrays of wanted numbers.
a = np.array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]])
n = np.array([6, 7, 11])
Identify the locations of the wanted numbers:
mask = np.isin(a, n)
Select the rows and columns that have the wanted numbers:
np.where(mask, a, np.nan)\
[mask.any(axis=1)][:, mask.any(axis=0)]
#array([[ 6., 7.],
# [11., nan]])
One approach is to look for the bounding boxes by checking which elements in the array are contained in the second list. We can use scipy.ndimage:
from scipy import ndimage
m = np.isin(a, b)
a_components, _ = ndimage.label(m, np.ones((3, 3)))
bbox = ndimage.find_objects(a_components)
out = a[bbox[0]]
np.where(np.isin(out, b), out, np.nan)
array([[ 6., 7.],
[11., nan]])
Setup -
a = np.array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]])
b = np.array([6,7,11])
Or for b = np.array([10,12,16]) we'd get:
m = np.isin(a, b)
a_components, _ = ndimage.label(m, np.ones((3, 3)))
bbox = ndimage.find_objects(a_components)
out = a[bbox[0]]
np.where(np.isin(out, b), out, np.nan)
array([[10., nan, 12.],
[nan, 16., nan]])
We could also adapt the above for multiple bounding boxes by doing:
b = np.array([5, 11, 8, 14])
m = np.isin(a, b)
a_components, _ = ndimage.label(m, np.ones((3, 3)))
bbox = ndimage.find_objects(a_components)
l = []
for box in bbox:
    out = a[box]
    l.append(np.where(np.isin(out, b), out, np.nan))
print(l)
[array([[ 5., nan],
[nan, 11.]]),
array([[ 8., nan],
[nan, 14.]])]
Taking advantage of the specific form of template array A we can directly transform the test values to coordinates:
A = np.arange(20).reshape(4,5)
test = [6,7,11]
y,x = np.unravel_index(test,A.shape)
yl,yr = y.min(),y.max()
xl,xr = x.min(),x.max()
out = np.full((yr-yl+1,xr-xl+1),np.nan)
out[y-yl,x-xl]=test
out
# array([[ 6., 7.],
# [11., nan]])
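The coordinate-based approach can be wrapped in a small reusable function. This is only a sketch under the same assumption as above, namely that the template array is np.arange(rows * cols).reshape(rows, cols), so each value is its own flat index; the name extract_block is hypothetical:

```python
import numpy as np

def extract_block(values, shape):
    """Bounding block of `values` inside an arange-filled array of `shape`."""
    values = np.asarray(values)
    # Each value is its own flat index, so unravel_index gives (row, col)
    rows, cols = np.unravel_index(values, shape)
    r0, c0 = rows.min(), cols.min()
    # NaN-filled block just large enough to hold all wanted values
    block = np.full((rows.max() - r0 + 1, cols.max() - c0 + 1), np.nan)
    block[rows - r0, cols - c0] = values
    return block

print(extract_block([6, 7, 11], (4, 5)))
print(extract_block([10, 12, 16], (4, 5)))
```

Unlike the scipy.ndimage variant, this always returns a single bounding block, even when the wanted values form several disconnected groups.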

Variable Partial Array Summation in Python

I'm looking for a solution to sum per column in a 2D array ("a" in the example below) and starting from a cell position as defined in a different 1D array ("ref" in the example below).
I have tried the following:
import numpy as np
a = np.arange(20).reshape(5, 4)
print(a) # representing an original large 2D array
ref = np.array([0, 2, 4, 1]) # reference array for defining start of sum
s = a.sum(axis=0)
print(s) # Works: sums all elements per column
s = a[2:].sum(axis=0)
print(s) # Works as well: sum from the third element till end per column
# This is what I look for: sum per column starting at element defined by ref[]
s = np.zeros(4).astype(int) # makes a zero-filled 1D array
for i in np.arange(4): # for each column
    for j in np.arange(ref[i], 5):
        s[i] += a[j, i] # sums all elements from ref till end (i.e. 5)
print(s) # This is the desired outcome
for i in np.arange(4):
    s = a[ref[i]:].sum(axis=0)
print(s) # No good; same as a[ref[3]:].sum(axis=0) and here ref[3] = 1
s = np.zeros(4).astype(int) # makes a zero-filled 1D array
for i in np.arange(4):
    s[i] = np.sum(a[ref[i]:, i])
print(s) # Yes; this is also the desired outcome
Is it possible to realize this without using a for loop?
Does numpy have functions for doing this in a single step?
s = a[ref:].sum(axis=0)
This would be nice, but is not working.
Thank you for your time!
A basic solution based on np.cumsum:
In [1]: a = np.arange(15).reshape(5, 3)
In [2]: res = np.array([0, 2, 3])
In [3]: b = np.cumsum(a, axis=0)
In [4]: b
Out[4]:
array([[ 0, 1, 2],
[ 3, 5, 7],
[ 9, 12, 15],
[18, 22, 26],
[30, 35, 40]])
In [5]: a
Out[5]:
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11],
[12, 13, 14]])
In [6]: b[res, np.arange(a.shape[1])]
Out[6]: array([ 0, 12, 26])
In [7]: b[-1, :] - b[res, np.arange(a.shape[1])]
Out[7]: array([30, 23, 14])
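This does not give us the result we want, because b[res, ...] already includes the element at the start row, so that element is subtracted as well. We need to prepend a row of zeros to b: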
In [13]: b = np.vstack([np.zeros((1, a.shape[1])), b])
In [14]: b
Out[14]:
array([[ 0., 0., 0.],
[ 0., 1., 2.],
[ 3., 5., 7.],
[ 9., 12., 15.],
[ 18., 22., 26.],
[ 30., 35., 40.]])
In [17]: b[-1, :] - b[res, np.arange(a.shape[1])]
Out[17]: array([ 30., 30., 25.])
which is, I believe, the desired output.
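Applied to the arrays from the question itself, the loop-free versions might look like the sketch below. It shows two options: a boolean mask built by broadcasting, and the cumsum trick with an integer row of zeros so the result stays an integer array. The names mask, s_mask, c and s_cumsum are only illustrative:

```python
import numpy as np

a = np.arange(20).reshape(5, 4)
ref = np.array([0, 2, 4, 1])  # first row to include, per column

# Option 1: mask[i, j] is True where row i is at or below ref[j]
mask = np.arange(a.shape[0])[:, None] >= ref
s_mask = np.where(mask, a, 0).sum(axis=0)

# Option 2: cumulative sums with a leading row of zeros, so that
# c[k, j] is the sum of the first k elements of column j
c = np.vstack([np.zeros((1, a.shape[1]), dtype=a.dtype),
               np.cumsum(a, axis=0)])
s_cumsum = c[-1] - c[ref, np.arange(a.shape[1])]

print(s_mask)
print(s_cumsum)
```

The mask version is the more direct translation of the double loop; the cumsum version avoids building a full-size boolean array, which may matter for very large inputs.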

Output in scipy.stats.binned_statistic_dd()

I am trying to use scipy.stats.binned_statistic_dd and I can't for the life of me figure out the outputs. Does anyone have any advice here?
Look at this simple sample program:
import scipy.stats
scipy.__version__
# '0.14.0'
import numpy as np
print(scipy.stats.binned_statistic_dd([np.ones(10), np.ones(10)], np.arange(10), 'count', bins=3))
#(array([[ 0., 0., 0.],
# [ 0., 10., 0.],
# [ 0., 0., 0.]]),
# [array([ 0.5 , 0.83333333, 1.16666667, 1.5 ]),
# array([ 0.5 , 0.83333333, 1.16666667, 1.5 ])],
# array([12, 12, 12, 12, 12, 12, 12, 12, 12, 12]))
So the documentation claims the outputs are:
statistic : ndarray, shape(nx1, nx2, nx3,...) The values of the
selected statistic in each two-dimensional bin
edges : list of
ndarrays A list of D arrays describing the (nxi + 1) bin edges for
each dimension
binnumber : 1-D ndarray of ints This assigns to each
observation an integer that represents the bin in which this
observation falls. Array has the same length as values.
In the example the statistic makes good sense: I asked for the 'count' and got 10, and there are 10 elements all in that same bin. Edges makes good sense too: the data to be binned had dimension 2 and I wanted 3 bins, so I got out 4 edges per dimension, which are reasonable.
Then the question: the binnumber makes no sense to me at all. It is array([12, 12, 12, 12, 12, 12, 12, 12, 12, 12]); there are indeed 10 numbers, the same length as the input data, np.arange(10), but the number 12 makes no sense at all. What am I missing? 12 is not a flat index over the bins arranged as a multi-dimensional array; since there are 3 bins in each dimension, I would expect numbers up to 9. What is 12 telling me?
The values in binnumbers are flat (raveled) indices into a grid of bins that includes an extra
set of "out of range" bins on every side.
In this example,
In [40]: hst, edges, bincounts = binned_statistic_dd([np.ones(10), np.ones(10)], None, 'count', bins=3)
In [41]: hst
Out[41]:
array([[ 0., 0., 0.],
[ 0., 10., 0.],
[ 0., 0., 0.]])
the bins are numbered as follows:
0 | 1 | 2 | 3 | 4
-----+-----+-----+-----+-----
5 | 6 | 7 | 8 | 9
-----+-----+-----+-----+-----
10 | 11 | 12 | 13 | 14
-----+-----+-----+-----+-----
15 | 16 | 17 | 18 | 19
-----+-----+-----+-----+-----
20 | 21 | 22 | 23 | 24
The "out of range" bins are not included in hst; the data in hst corresponds to bin numbers
6, 7, 8, 11, 12, 13, 16, 17 and 18. That's why all the values in bincounts are 12:
In [42]: bincounts
Out[42]: array([12, 12, 12, 12, 12, 12, 12, 12, 12, 12])
You can use the range argument to force the counts into the outer bins. For example,
by setting the ranges of the coordinates to be [2, 3] and [0, 0.5], so all the values in the
first coordinate are left of their range and all the values in the second coordinate are
to the right of their range, all the points end up in the upper right outer bin, which is
bin index 4:
In [51]: binned_statistic_dd([np.ones(10), np.ones(10)], None, 'count', bins=3, range=[[2,3],[0,0.5]])
Out[51]:
(array([[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.]]),
[array([ 2. , 2.33333333, 2.66666667, 3. ]),
array([ 0. , 0.16666667, 0.33333333, 0.5 ])],
array([4, 4, 4, 4, 4, 4, 4, 4, 4, 4]))
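The numbering scheme described above can be decoded with np.unravel_index. The following is a sketch that uses only NumPy and hard-coded bin numbers (12 and 4, the two values seen in the examples above); bins_per_dim, padded_shape and the other names are hypothetical helpers, not part of SciPy:

```python
import numpy as np

bins_per_dim = (3, 3)
# Each dimension gains one "out of range" bin on either side: 3 -> 5
padded_shape = tuple(n + 2 for n in bins_per_dim)

binnumber = np.array([12, 12, 4])

# Flat bin number -> (row, column) in the padded 5 x 5 grid
padded_idx = np.unravel_index(binnumber, padded_shape)

# Shift by one to get indices into the 3 x 3 statistic array
stat_idx = np.stack(padded_idx) - 1

# A point is inside the histogram only if every coordinate is in range
upper = np.array(bins_per_dim)[:, None]
in_range = np.all((stat_idx >= 0) & (stat_idx < upper), axis=0)

print(stat_idx)
print(in_range)
```

Bin number 12 maps to position (1, 1) of the statistic array, the cell holding the count of 10, while bin number 4 falls outside it. More recent SciPy releases also document an expand_binnumbers option on binned_statistic_dd that returns per-dimension bin indices directly.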

variable assignment: keep shape

...better to directly show the code. Here it is:
import numpy as np
a = np.zeros([3, 3])
a
array([[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.]])
b = np.random.random_integers(0, 100, size = (1, 3))
b
array([[ 10, 3, 8]])
c = np.random.random_integers(0, 100, size = (4, 3))
c
array([[ 22, 21, 14],
[ 55, 64, 12],
[ 33, 85, 98],
[ 37, 44, 45]])
a = b will change dimensions of a
a = c will change dimensions of a
for a = b, I want:
array([[ 10., 3., 8.],
[ 0., 0., 0.],
[ 0., 0., 0.]])
and for a = c, I want:
array([[ 22, 21, 14],
[ 55, 64, 12],
[ 33, 85, 98]])
So I want to lock the shape of 'a' so that values being assigned to it get "cropped" if necessary. Of course without if statements.
The problem is that the assignment operator does not copy any data: a = b simply rebinds the name a to the same array object as b. What you want instead is to copy part of one array into the other.
So for this, if you know that b only has one outer array, then you can do:
a[0] = b
And if you know that a is a 3x3, then you could also do:
a = c[0:3]
Furthermore, if you want them to be actual deep copies, you'll want:
a[0] = b.copy()
and
a = c[0:3].copy()
To make them independent.
If you don't already know the lengths of the matrices, you can use the len() function to find out at runtime.
You can do this easily by using NumPy slice notation. Here is a SO question with good answers explaining it clearly. Essentially, you need to ensure that the shapes of the left-hand array and the right-hand array match, and you can achieve this by slicing the corresponding arrays appropriately.
import numpy as np
a = np.zeros([3, 3])
b = np.array([[ 10, 3, 8]])
c = np.array([[ 22, 21, 14],
[ 55, 64, 12],
[ 33, 85, 98],
[ 37, 44, 45]])
a[0] = b
print(a)
a = c[0:3]
print(a)
Output:
[[ 10. 3. 8.]
[ 0. 0. 0.]
[ 0. 0. 0.]]
[[22 21 14]
[55 64 12]
[33 85 98]]
It seems you want to replace elements in the top left of a 2D array with elements from a second 2D array without worrying about the sizes of the arrays. Here is a method:
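def replacer(orig, repl):
    # Work on a copy so that the original array is left untouched
    new = np.copy(orig)
    w2, h1 = new.shape
    w1, h2 = repl.shape
    # Copy only the top-left region that fits in both arrays
    new[0:min(w1,w2), 0:min(h1,h2)] = repl[0:min(w1,w2), 0:min(h1,h2)]
    return new
print(replacer(a,b))
print(replacer(a,c))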
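If the goal is to keep the original array object, so that its shape and dtype really stay locked, an in-place variant of the same cropping idea is possible. A minimal sketch; assign_cropped is a hypothetical name, and np.atleast_2d is used only so that 1-D inputs are accepted as a single row:

```python
import numpy as np

def assign_cropped(dst, src):
    """Copy the top-left part of src that fits into dst, in place."""
    src = np.atleast_2d(src)
    rows = min(dst.shape[0], src.shape[0])
    cols = min(dst.shape[1], src.shape[1])
    # Slice assignment writes into dst's existing buffer
    dst[:rows, :cols] = src[:rows, :cols]
    return dst

b = np.array([[10, 3, 8]])
c = np.array([[22, 21, 14],
              [55, 64, 12],
              [33, 85, 98],
              [37, 44, 45]])

a1 = assign_cropped(np.zeros((3, 3)), b)
a2 = assign_cropped(np.zeros((3, 3)), c)
print(a1)
print(a2)
```

Since the destination is created with np.zeros, the results remain float arrays; the min() calls take the place of the if statements the question wanted to avoid.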

Rewrite a double loop in a nicer and maybe shorter way

I am wondering if the following code can be written in a somewhat nicer way. Basically, I want to calculate z = f(x, y) for a (x, y) meshgrid.
a = linspace(0, xr, 100)
b = linspace(0, yr, 100)
for i in xrange(100):
    for j in xrange(100):
        z[i][j] = f(a[i],b[j])
Yeah. Your code as presented in the question is nice.
Do not ever think that few lines is "nice" or "cool". What counts is clarity, readability and maintainability. Other people should be able to understand your code (and you should understand it in 12 months, when you need to find a bug).
Many programmers, especially young ones, think that "clever" solutions are desirable. They are not. And that's what is so nice with the python community. We are much less afflicted by that mistake than others.
You could do something like
z = [[f(item_a, item_b) for item_b in b] for item_a in a]
You could use itertools' product:
from itertools import product
[f(i,j) for i,j in product( a, b )]
and if you really want to shrink those 5 lines into 1 then:
[f(i,j) for i,j in product( linspace(0,xr,100), linspace(0,yr,100) )]
To take it even further if you want a function of xr and yr where you can also preset the ranges of 0 and 100 to something else:
def ranged_linspace( _start, _end, _function ):
    def output_z( xr, yr ):
        return [_function( i, j ) for i,j in product( linspace( _start, xr, _end ), linspace( _start, yr, _end ) )]
    return output_z
If you set it all at once, you can use a list comprehension;
[[f(a[i], b[j]) for j in range(100)] for i in range(100)]
If you need to use a z that's already there, however, you can't do that and your code is about the neatest you'll get.
Addition: I don't know what this lingrid does, but if it produces a 100-element list, use aaronasterling's list comprehension; there is no point in creating an extra iterator if you don't need to.
This shows the general result. a is made into a list 6-long and b is 4-long. The result is a list of 6 lists, and each nested list is 4 elements long.
>>> def f(x,y):
...     return x+y
...
>>> a, b = list(range(0, 12, 2)), list(range(0, 12, 3))
>>> print(len(a), len(b))
6 4
>>> result = [[f(aa, bb) for bb in b] for aa in a]
>>> print(result)
[[0, 3, 6, 9], [2, 5, 8, 11], [4, 7, 10, 13], [6, 9, 12, 15], [8, 11, 14, 17], [10, 13, 16, 19]]
I think this is the one-line code that you are looking for
z = [[a+b for b in linspace(0,yr,100)] for a in linspace(0,xr,100)]
Your linspace actually looks like it could be np.linspace. If it is you could operate on the numpy arrays without having to iterate explicitly:
z = f(x[:, np.newaxis], y)
For example:
>>> import numpy as np
>>> x = np.linspace(0, 9, 10)
>>> y = np.linspace(0, 90, 10)
>>> x[:, np.newaxis] + y # or f(x[:, np.newaxis], y)
array([[ 0., 10., 20., 30., 40., 50., 60., 70., 80., 90.],
[ 1., 11., 21., 31., 41., 51., 61., 71., 81., 91.],
[ 2., 12., 22., 32., 42., 52., 62., 72., 82., 92.],
[ 3., 13., 23., 33., 43., 53., 63., 73., 83., 93.],
[ 4., 14., 24., 34., 44., 54., 64., 74., 84., 94.],
[ 5., 15., 25., 35., 45., 55., 65., 75., 85., 95.],
[ 6., 16., 26., 36., 46., 56., 66., 76., 86., 96.],
[ 7., 17., 27., 37., 47., 57., 67., 77., 87., 97.],
[ 8., 18., 28., 38., 48., 58., 68., 78., 88., 98.],
[ 9., 19., 29., 39., 49., 59., 69., 79., 89., 99.]])
But you could also use np.ogrid instead of two linspace:
>>> import numpy as np
>>> x, y = np.ogrid[0:10, 0:100:10]
>>> x + y # or f(x, y)
array([[ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90],
[ 1, 11, 21, 31, 41, 51, 61, 71, 81, 91],
[ 2, 12, 22, 32, 42, 52, 62, 72, 82, 92],
[ 3, 13, 23, 33, 43, 53, 63, 73, 83, 93],
[ 4, 14, 24, 34, 44, 54, 64, 74, 84, 94],
[ 5, 15, 25, 35, 45, 55, 65, 75, 85, 95],
[ 6, 16, 26, 36, 46, 56, 66, 76, 86, 96],
[ 7, 17, 27, 37, 47, 57, 67, 77, 87, 97],
[ 8, 18, 28, 38, 48, 58, 68, 78, 88, 98],
[ 9, 19, 29, 39, 49, 59, 69, 79, 89, 99]])
It somewhat depends on what your f is. If it contains functions like math.sin you need to replace them with numpy.sin.
If it's not about numpy then you should stick either with your option or optionally using enumerate when looping:
for idx1, ai in enumerate(a):
    for idx2, bj in enumerate(b):
        z[idx1][idx2] = f(ai, bj)
This has the advantage that you don't need to hardcode your range (or xrange) or use len(a) as input. But in general, if there is no huge performance difference [1], use the method that you and others using your code will understand easily.
[1] If your a and b are numpy.arrays then there would be a significant performance difference, because numpy can process the arrays much faster if no list<->numpy.array conversions are required.
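To make the math.sin versus numpy.sin remark concrete, here is a small sketch. The function f below is a made-up example rather than the asker's function; it shows a scalar-only f evaluated on the grid through np.vectorize, next to the same computation rewritten with NumPy ufuncs and broadcasting:

```python
import math
import numpy as np

def f(x, y):
    # Scalar-only: math.sin does not accept arrays
    return math.sin(x) + y

a = np.linspace(0, 1, 4)
b = np.linspace(0, 2, 3)

# np.vectorize broadcasts the inputs but still calls f once per element,
# so it is a convenience wrapper rather than a speed-up
z_vec = np.vectorize(f)(a[:, np.newaxis], b)

# The same values computed by NumPy itself via broadcasting
z_np = np.sin(a)[:, np.newaxis] + b

print(z_vec.shape)
print(np.allclose(z_vec, z_np))
```

Both results have shape (len(a), len(b)), matching the z[i][j] = f(a[i], b[j]) layout of the original double loop.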
