I was unable to find anything describing how to do this, which leads me to believe I'm not doing this the proper, idiomatic Python way. Advice on the 'proper' Python way to do this would also be appreciated.
I have a bunch of variables for a datalogger I'm writing (arbitrary logging length, with a known maximum length). In MATLAB, I would initialize them all as 1-D arrays of zeros of length n, n bigger than the number of entries I would ever see, assign each individual element variable(measurement_no) = data_point in the logging loop, and trim off the extraneous zeros when the measurement was over. The initialization would look like this:
[dData gData cTotalEnergy cResFinal etc] = deal(zeros(n,1));
Is there a way to do this in Python/NumPy so that I don't have to put each variable on its own line:
dData = np.zeros(n)
gData = np.zeros(n)
etc.
I would also prefer not to just make one big matrix, because keeping track of which column is which variable is unpleasant. Perhaps the solution is to make a (length x numvars) matrix and assign the column slices out to individual variables?
EDIT: Assume I'm going to have a lot of vectors of the same length by the time this is over; e.g., my post-processing takes each log file, calculates a bunch of separate metrics (>50), stores them, and repeats until the logs are all processed. Then I generate histograms, means/maxes/sigmas/etc. for all the various metrics I computed. Since initializing 50+ vectors is clearly not easy in Python, what's the best (cleanest code and decent performance) way of doing this?
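For reference, a minimal sketch of that column-slice idea (a single (length x numvars) array whose columns are handed out as named views; the names below are purely illustrative):
import numpy as np

n = 1000
names = ['dData', 'gData', 'cTotalEnergy', 'cResFinal']   # illustrative subset of the 50+ metrics
log = np.zeros((n, len(names)))                  # one (length x numvars) matrix
dData, gData, cTotalEnergy, cResFinal = log.T    # column views, not copies
dData[3] = 42.0                                  # writes straight into log[3, 0]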
If you're really motivated to do this in a one-liner you could create an (n_vars, ...) array of zeros, then unpack it along the first dimension:
a, b, c = np.zeros((3, 5))
print(a is b)
# False
Another option is to use a list comprehension or a generator expression:
a, b, c = [np.zeros(5) for _ in range(3)] # list comprehension
d, e, f = (np.zeros(5) for _ in range(3)) # generator expression
print(a is b, d is e)
# False False
Be careful, though! You might think that using the * operator on a list or tuple containing your call to np.zeros() would achieve the same thing, but it doesn't:
h, i, j = (np.zeros(5),) * 3
print(h is i)
# True
This is because the expression inside the tuple gets evaluated first. np.zeros(5) therefore only gets called once, and each element in the repeated tuple ends up being a reference to the same array. This is the same reason why you can't just use a = b = c = np.zeros(5).
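For instance, a quick demonstration of the shared-reference pitfall:
import numpy as np

a = b = c = np.zeros(5)
a[0] = 1.0
print(b)
# [1. 0. 0. 0. 0.]  <- b (and c) changed too: all three names refer to the same array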
Unless you really need to assign a large number of empty array variables and you really care deeply about making your code compact (!), I would recommend initialising them on separate lines for readability.
Nothing wrong or un-Pythonic with
dData = np.zeros(n)
gData = np.zeros(n)
etc.
You could put them on one line, but there's no particular reason to do so.
dData, gData = np.zeros(n), np.zeros(n)
Don't try dData = gData = np.zeros(n), because a change to dData changes gData (they point to the same object). For the same reason you usually don't want to use x = y = [].
The deal in MATLAB is a convenience, but it isn't magical. Here's how Octave implements it:
function [varargout] = deal (varargin)
  if (nargin == 0)
    print_usage ();
  elseif (nargin == 1 || nargin == nargout)
    varargout(1:nargout) = varargin;
  else
    error ("deal: nargin > 1 and nargin != nargout");
  endif
endfunction
In contrast to Python, in Octave (and presumably MATLAB)
one=two=three=zeros(1,3)
assigns different objects to the 3 variables.
Notice also how MATLAB talks about deal as a way of assigning contents of cells and structure arrays. http://www.mathworks.com/company/newsletters/articles/whats-the-big-deal.html
If you put your data in a collections.defaultdict you won't need to do any explicit initialization. Everything will be initialized the first time it is used.
import numpy as np
import collections
n = 100
data = collections.defaultdict(lambda: np.zeros(n))
for i in range(1, n):
    data['g'][i] = data['d'][i - 1]
    # ...
How about using map:
import numpy as np
n = 10 # Number of data points per array
m = 3 # Number of arrays being initialised
gData, pData, qData = map(np.zeros, [n] * m)
Related
I want to understand how the einsum function in NumPy is implemented. I found the source code in the numpy/core/src/multiarray/einsum.c.src file but couldn't completely understand it. In particular, I want to understand how it creates the required loops automatically.
For example:
import numpy as np
a = np.random.rand(2,3,4,5)
b = np.random.rand(5,3,2,4)
ll = np.einsum('ijkl, ljik ->', a, b)  # This should loop over all
# four indices i, j, k, l. How does it create loops for these indices automatically?
# I assume that under the hood it does the following:
sum1 = 0
for i in range(2):
    for j in range(3):
        for k in range(4):
            for l in range(5):
                sum1 = sum1 + a[i, j, k, l] * b[l, j, i, k]
Thank you in advance
PS: This question is not about how to use numpy.einsum.
I want to understand how it creates the required loops automatically.
Well, it does not create the loops the way you think it does. In this case, it creates an iterator operating over multiple arrays and then uses it in a generic main loop. In the more general case, there are two main loops: one to iterate over the output array items and one to perform a reduction.
The main function is PyArray_EinsteinSum. In your case, it takes an unoptimized path and ends up creating a basic iteration function based on the iterator created previously (ie. iter). This function is get_sum_of_products_function. It basically analyzes the einsum operation so as to find the best sum-of-products function to call, based on a lookup table (like _outstride0_specialized_table). In your specific case, double_sum_of_products_outstride0_two is called. Numpy uses a template system to generate this function automatically at build time (*.c.src files are template files converted to *.c files based on predefined basic comments). In this case, the function is generated from #name#_sum_of_products_outstride0_#noplabel# and, once expanded by the template processor, it gives something like the following function:
static void double_sum_of_products_outstride0_two(int nop,
                                                  char **dataptr,
                                                  npy_intp const *strides,
                                                  npy_intp count)
{
    npy_double accum = 0;
    char *data0 = dataptr[0];
    npy_intp stride0 = strides[0];
    char *data1 = dataptr[1];
    npy_intp stride1 = strides[1];

    while (count--)
    {
        accum += (*(npy_double *)data0) * (*(npy_double *)data1);
        data0 += stride0;
        data1 += stride1;
    }

    *((npy_double *)dataptr[2]) = (accum + (*((npy_double *)dataptr[2])));
}
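For intuition, here is a rough, illustrative Python equivalent of what that generated kernel computes for this particular call (a sketch of the behaviour only, not of how the C code is structured): the second operand is reordered so it lines up element-for-element with the first, and the kernel is then a single strided sum-of-products loop accumulated into one output value.
import numpy as np

a = np.random.rand(2, 3, 4, 5)
b = np.random.rand(5, 3, 2, 4)
b_aligned = np.transpose(b, (2, 1, 3, 0))        # 'ljik' -> 'ijkl', pairs up with a
accum = 0.0
for x, y in zip(a.ravel(), b_aligned.ravel()):   # one main loop over 120 pairs
    accum += x * y                               # the sum-of-products kernel
assert np.isclose(accum, np.einsum('ijkl, ljik ->', a, b))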
As you can see, there is only one main loop, iterating over the previously generated iterator. In your case, stride0 and stride1 are both equal to 8, data0 and data1 point into the raw input arrays, dataptr[2] points into the raw output, and count is initially set to 120. Note that the fact that both strides are equal to 8 is surprising at first glance, since the einsum does not iterate over the two arrays contiguously. This is because the second array is copied and reordered: Numpy cannot create a uniform view over it based on the einsum parameters.
Note that the fallback case used for the example code is not particularly optimized, and it only produces one value. For example, the much more optimized double_sum_of_products_contig_contig_outstride0_two function can be called from unbuffered_loop_nop2_ndim2 for the following code:
import numpy as np
a = np.random.rand(3, 10)
b = np.random.rand(3, 10)
for i in range(1):
    ll = np.einsum('ij, ij -> i', a, b)
In this case, double_sum_of_products_contig_contig_outstride0_two performs the reduction for a given output item, and unbuffered_loop_nop2_ndim2 iterates over the output array.
If the expression ij, ij -> j is used instead in the above code, then the function double_sum_of_products_contig_two is called, which operates the same way as double_sum_of_products_contig_contig_outstride0_two except that it reads/writes the whole output line during the reduction.
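To make the difference between those two reductions concrete, here is a small check that they match the equivalent whole-array expressions (reusing the arrays from the snippet above):
import numpy as np

a = np.random.rand(3, 10)
b = np.random.rand(3, 10)
# 'ij,ij->i' collapses each row's products to a single output item,
# while 'ij,ij->j' accumulates into a whole output line (one value per column).
assert np.allclose(np.einsum('ij, ij -> i', a, b), (a * b).sum(axis=1))
assert np.allclose(np.einsum('ij, ij -> j', a, b), (a * b).sum(axis=0))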
I am having some trouble optimizing the following for loops. I have looked into the map function, comprehension expressions, and a bit of itertools and generators, but I am unsure how to approach optimization for nested loops. Any help or suggestions are greatly appreciated. Thanks in advance!
Note that the object architecture is:
self.variables.parameters.index1/index2/value
self.variables.rate
self.state.num
Loop 1:
mat1 = np.zeros((m, n))
for idx, variable in enumerate(self.variables):
    for parameter in variable.parameters:
        tracking_idx = parameter.index1 + parameter.index2
        mat1[tracking_idx, idx] = parameter.value
Loop 2:
mat2 = []
for variable in self.variables:
    rate = variable.rate
    for parameter in variable.parameters:
        if parameter.value < 0 and self.state.num[parameter.index1, parameter.index2] <= 0:
            rate = 0
    mat2.append(rate)
With a numpy tag I assume 'optimize' means casting these loops as compiled numpy array expressions. I don't think there's a way to do that without first collecting all the data into numpy arrays, which will require the same loops.
You have a list of n variables. Each variable has a list of ? parameters. Each parameter has 3 or 4 attributes.
So
rates = np.array([v.rate for v in variables])
values = np.array([[p.value for p in v.parameters] for v in variables])
index1s = <ditto>  # (?, n) array
index2s = <ditto>  # (?, n) array
self.state.num is already a 2d array, size compatible with the range of values in index1s and index2s.
Given those 1d and 2d arrays we should be able to derive mat1 and mat2 with whole-array operations. If ? is small relative to m and the range of values in index1 and index2, it might be worth doing this. I don't have a realistic feel for your data.
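Here is a hedged sketch of those whole-array operations. It assumes every variable has the same number of parameters, so the comprehensions above produce rectangular arrays with one row per variable (transpose if you prefer the (?, n) layout), and it uses placeholder names such as state_num, m and n:
import numpy as np

# rates:   (n,)   one rate per variable
# values:  (n, ?) one row per variable
# index1s, index2s: (n, ?) same layout as values
rows = index1s + index2s                                      # tracking index per entry
cols = np.broadcast_to(np.arange(len(rates))[:, None], rows.shape)
mat1 = np.zeros((m, n))
mat1[rows, cols] = values                                     # vectorised loop 1

blocked = (values < 0) & (state_num[index1s, index2s] <= 0)
mat2 = np.where(blocked.any(axis=1), 0, rates)                # vectorised loop 2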
===========
You mention the map function, comprehension expressions, and a bit of itertools and generators. These can make the code look a little cleaner, but they don't make much difference in speed.
I demonstrated the use of list comprehensions above. Those expressions can be more elaborate, but often at the cost of readability. I like writing helper functions to hide details. A generator comprehension can replace a list comprehension that feeds another. Maps cover the same functionality.
Since variables and parameters have attributes I assume you have defined them in classes. You could write methods that extract those attributes as simple lists or arrays.
class Variable(....):
    ....
    def get_values(self):
        return [p.value for p in self.parameters]

    def get_rate(self, state):
        rate = self.rate
        for parameter in self.parameters:
            if (parameter.value < 0 and
                    state.num[parameter.index1, parameter.index2] <= 0):
                rate = 0
        return rate
values = [v.get_values() for v in variables]
rates = [v.get_rate(self.state) for v in variables]
You could even write these helper functions without the class structure.
That doesn't speed up anything; it just hides some details in the object.
This is more of a Matlab programming question than it is a math question.
I'd like to run gradient descent multiple times with different learning rates. I have a set of learning rates
alpha = [0.3, 0.1, 0.03, 0.01, 0.003, 0.001];
and each time I run gradient descent, I get a vector J_vals as output. However, I don't know Matlab well enough to know how to implement this besides doing something like:
[theta, J_vals] = gradientDescent(...., alpha(1),...);
J1 = J_vals;
[theta, J_vals] = gradientDescent(...., alpha(2),...);
J2 = J_vals;
and so on.
I thought about using a for loop, but then I don't know how I would deal with all the J_vals vectors (I'm not sure how to apply the for loop to J1, J2, and so on). Perhaps it would look something like this:
for i = 1:length(alpha)
    [theta, J_vals] = gradientDescent(..., alpha(i), ...);
    J(i) = J_vals;
end
Then I would have a vector of vectors.
In Python, I would just run a for loop and append each new result to the end of a list. How do I implement something like this in Matlab? Or is there a more efficient way?
If you know how many loops you are going to have and the size of J_vals (or at least a reasonable upper bound), I would suggest pre-allocating the container array:
J = zeros(n,1);
then on each loop insert the new values
J(start:start+length(J_vals)-1) = J_vals;
That way you don't reallocate memory. If you don't know, you can append the values to the array. For example,
J = []; % initialize
for i = 1:length(alpha)
    [theta, J_vals] = gradientDescent(..., alpha(i), ...);
    J = [J; J_vals]; % append the new column vector
end
but this re-allocates the array on every loop iteration. If there aren't too many loops it should be OK.
Matlab's "cell arrays" are kind of like lists in Python. They are similar in that you can put variable datatypes into them. Nobody seems to be too sure, but most likely the cell array is implemented as an array of object pointers. That means that it is still somewhat expensive to append to it (cell_array{length(cell_array) + 1} = new_data), but at least you are only appending a pointer instead of the entire column. You would still have to convert the cell array to a normal matrix afterward using cell2mat.
The most idiomatic Matlab solution is to pre-allocate (as @dpmcmlxxvi suggested).
I think what you are describing is a really common use case, and it's unfortunate that Matlab requires such a verbose idiom for this. Also it's frustrating that the documentation is opaque on how cell arrays are implemented and whether it is expensive to append to a cell array.
Your solution works just fine as long as you add a : for the row subscript (assuming J_vals is a column vector):
for i = 1:length(alpha)
    [theta, J_vals] = gradientDescent(..., alpha(i), ...);
    J(:, i) = J_vals;
    %// ^... all rows, column 'i'
end
You could even put that as the return value:
for i = 1:length(alpha)
    [theta, J(:, i)] = gradientDescent(..., alpha(i), ...);
    %// ^... add returned value directly to our list
end
Both of these methods allow you to preallocate your matrix for a potential speed gain.
If you want to build your list as you go, you can use the method in @dpmcmlxxvi's answer, or you can use the special subscript end. Neither of these methods is compatible with preallocation, though.
for i = 1:length(alpha)
    [theta, J(:, end+1)] = gradientDescent(..., alpha(i), ...);
    %// ^... add new vector after the current end of the list
end
I would also suggest that you not use i as a variable name in Matlab. I know it's natural in other languages, but in Matlab it shadows the built-in imaginary constant i.
See: https://stackoverflow.com/a/14790765/1377097
Given an array of integer counts c, how can I transform that into an array of integers inds such that np.all(np.bincount(inds) == c) is true?
For example:
>>> c = np.array([1,3,2,2])
>>> inverse_bincount(c) # <-- what I need
array([0,1,1,1,2,2,3,3])
Context: I'm trying to keep track of the location of multiple sets of data, while performing computation on all of them at once. I concatenate all the data together for batch processing, but I need an index array to extract the results back out.
Current workaround:
from itertools import chain
import numpy as np

def inverse_bincount(c):
    return np.array(list(chain.from_iterable([i] * n for i, n in enumerate(c))))
Using numpy.repeat:
np.repeat(np.arange(c.size), c)
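For the c from the question, this produces exactly the required inds:
import numpy as np

c = np.array([1, 3, 2, 2])
inds = np.repeat(np.arange(c.size), c)      # repeat each index i exactly c[i] times
print(inds)                                 # [0 1 1 1 2 2 3 3]
print(np.all(np.bincount(inds) == c))       # True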
No numpy needed:
from functools import reduce  # needed on Python 3
c = [1, 3, 2, 2]
reduce(lambda x, y: x + [y] * c[y], range(len(c)), [])
The following is about twice as fast on my machine as the currently accepted answer, although I must say I am surprised by how well np.repeat does. I would expect it to suffer a lot from temporary object creation, but it does pretty well.
import numpy as np
c = np.array([1, 3, 2, 2])
p = np.cumsum(c)            # cumulative counts: [1 4 6 8]
i = np.zeros(p[-1], dtype=int)
np.add.at(i, p[:-1], 1)     # mark the positions where the output index steps up
print(np.cumsum(i))         # [0 1 1 1 2 2 3 3]
So, I'm looking at Python, and I have a large 2d numpy array of data, and I want to take m rows of this large data matrix. I've looked into random.sample, numpy.random.shuffle, and numpy.random.permutation; all of these work, but usually they return the whole permutation or at least generate the entire range(n). If I had a very large dataset, then doing something like
data = numpy.random.uniform(size=(n, 100))
myvec = data[random.sample(range(n), m), :]
will allocate a vector range(n), which blows up pretty fast. So I thought I could use xrange, which returns a generator, but hey, you can't just get an arbitrary element out of a generator; that's not the way they work.
I tried it out, and it works.
data = numpy.random.uniform(size=(n, 100))
myvec = data[random.sample(xrange(n), m), :]
Any idea how?
UPDATE:
I can use
samp = random.sample(range(n),10)
for n up to 100000000 before I get a memory error. If I use
samp = random.sample(xrange(n),10)
on the other hand, I only start getting errors because of the int conversion to C; namely, the int gets too long to be converted to C, at around 1000000000. Sure, it's only a factor of 10, but I'm curious. The xrange variant is also much faster.
from random import randrange

def sample(n, m):
    d = set()
    while len(d) < m:
        d.add(randrange(n))
    return d
>>> sample(100000000000000000000000000000000000, 10)
set([5577049102993258248888250482046894L, 86044086231860190654588187118815513L, 2021737354726858669049814270580972L, 6253501639432326715043836478191628L, 5306460388221333758367322518700483L, 62195356583363524099133566314034473L, 376650426515181012918370326724858L, 80588135672357701239461833469588557L, 1978959860575617450893346333245569L, 41904683348442252013350548717573039L])
Note that a simple {randrange(n) for _ in range(m)} will do the job with very high probability when n is much larger than m.
So it turns out that xrange objects support indexing (they are lazy sequences, not generators), which is exactly what random.sample() relies on. So that's how it works.
a = xrange(10)
print a[5] #this works.
Elazar's solution works just as well though.