Is there an efficient function to calculate a product? - python

I'm looking for a numpy function (or a function from any other package) that would efficiently evaluate the product f(x_1) * f(x_2) * ... * f(x_N),
with f being a vector-valued function of a vector-valued input x. The product is taken to be a simple component-wise multiplication.
The issue here is that both the length of each x vector and the total number of result vectors (f of x) to be multiplied (N) are very large, on the order of millions. Therefore, it is impossible to generate all the results at once (they wouldn't fit in memory) and then multiply them afterwards using np.multiply.reduce or the like.
A toy example of the type of code I would like to replace is:
import numpy as np
x = np.ones(1000000)
prod = f(x)
for i in range(2, 1000000):
    prod *= f(i * np.ones(1000000))
with f a vector-valued function with the dimension of its output equal to the dimension of its input.
To be sure: I'm not looking for equivalent code, but for a single, highly optimized function. Is there such a thing?
For those familiar with Wolfram Mathematica: It would be the equivalent to Product. In Mathematica, I would be able to simply write Product[f[i ConstantArray[1,1000000]],{i,1000000}].

Numpy ufuncs all have a reduce method. np.multiply is a ufunc. So it's a one-liner:
np.multiply.reduce(v)
Here v is the vector of values, which you hopefully compute in an equally efficient manner.
To compute the vector, just apply your function to the input:
v = f(x)
So with your example:
np.multiply.reduce(np.sin(x))
Alternative
A simpler way to phrase the same thing is np.prod:
np.prod(v)
You can also use the prod method directly on your vector:
v.prod()
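If the N result vectors cannot all be held in memory, the same reduction can be applied block by block. The following is only a minimal sketch and not a built-in NumPy routine: f is a hypothetical stand-in for the question's function, and running_product and its chunk parameter are names made up here for illustration.

```python
import numpy as np

def f(x):
    # Hypothetical stand-in for the question's f: elementwise,
    # output has the same shape as the input.
    return 1.0 + 1.0 / (x * x)

def running_product(n_terms, length, chunk=4):
    """Component-wise product of f(i * ones(length)) for i = 1..n_terms.

    Only `chunk` result vectors are held in memory at any time.
    """
    prod = np.ones(length)
    for start in range(1, n_terms + 1, chunk):
        stop = min(start + chunk, n_terms + 1)
        # Column of multipliers i, broadcast against a row of ones.
        i = np.arange(start, stop, dtype=float)[:, None]
        block = f(i * np.ones(length))          # shape (stop - start, length)
        # Reduce the block along axis 0, then fold it into the running result in place.
        np.multiply(prod, np.multiply.reduce(block, axis=0), out=prod)
    return prod

result = running_product(n_terms=10, length=5)
print(result.shape)
```

The chunk size trades memory for fewer Python-level iterations; with chunk=1 this reduces to the loop from the question.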

Related

finite difference derivative array valued functions

Suppose I have the following code
import numpy as np
f = lambda x,y: (np.sum(x) + np.sum(y))**2
x = np.array([1,2,3])
y = np.array([4,5,6])
# wanted: df_dx, df_dy, df2_dx2, df2_dxdy, ...
Is there a fast way to compute all the derivatives (single and mixed) of such a function? The module should perform the classical finite difference technique at the array level, i.e. adding h = tol elementwise to the array variables (depending on the derivative), computing the function and dividing by h.
(My real case is much more complicated, as it involves an array-valued function coming from a DLL I can't modify. The number of variables is arbitrary, so please do not focus on this particular toy example.)
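The finite difference technique described above can be sketched as follows. This is an illustration under assumptions, not an existing module: fd_gradient, its which argument and the step h are hypothetical names, the function is assumed to return a scalar as in the toy example, and a central difference is used instead of the one-sided difference from the question.

```python
import numpy as np

def fd_gradient(func, args, which, h=1e-6):
    """Central-difference gradient of a scalar func with respect to args[which].

    Each element of the chosen array is perturbed by +h and -h in turn.
    """
    base = [np.asarray(a, dtype=float) for a in args]
    grad = np.zeros_like(base[which])
    for idx in np.ndindex(base[which].shape):
        plus = [a.copy() for a in base]
        minus = [a.copy() for a in base]
        plus[which][idx] += h
        minus[which][idx] -= h
        grad[idx] = (func(*plus) - func(*minus)) / (2.0 * h)
    return grad

f = lambda x, y: (np.sum(x) + np.sum(y)) ** 2
x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

df_dx = fd_gradient(f, (x, y), which=0)
df_dy = fd_gradient(f, (x, y), which=1)
print(df_dx, df_dy)
```

For the toy function the analytic gradient is 2 * (sum(x) + sum(y)) in every component, which gives a quick sanity check. Each gradient costs two function evaluations per array element, so this may be slow if the real function is expensive.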

Derivative of a sum in Theano

I want to calculate the gradient and Hessian of the following sum. AFAIK Theano should be able to do that, but I can't figure out how.
X is a matrix of size M x N, y is a vector of length M, and beta is a vector of length N.
One way to compute the sum is using the scan() function, which I did like this:
res, ups = theano.scan(
    lambda v, w: v*np.log(1/(1+np.exp(-1*w.dot(beta))))
                 + (1-v)*np.log(1/(1+np.exp(w.dot(beta)))),
    sequences=[y, X])
t7 = theano.function(inputs=[X, y, beta], outputs=res)
As far as I can tell, that works fine. However, I can't use this as an input for the grad() function with respect to beta.
So what I would like to know is whether there is a way to use the result of scan as input to grad, or a different way to compute the sum.
(I first tried sympy, but sympy can't lambdify IndexedBase objects, so I can compute the gradient but can't use it as a function. Maybe that helps?)
The sum adds up a function of the dot product of a row of X and beta, while the binary vector y decides which of two functions is used:
log(1/(1+exp(-X_i*beta)))
Hope that helps?
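Reading the two terms off the scan expression, the sum appears to be the logistic log-likelihood, for which the gradient and Hessian have closed forms. The sketch below uses plain NumPy rather than Theano, so it sidesteps scan and grad entirely; the function names are hypothetical, and it assumes y contains only 0s and 1s.

```python
import numpy as np

def log_likelihood(X, y, beta):
    """sum_i y_i*log(1/(1+exp(-z_i))) + (1-y_i)*log(1/(1+exp(z_i))), with z = X @ beta."""
    z = X @ beta
    # np.logaddexp(0, t) is a numerically stable log(1 + exp(t)).
    return np.sum(-y * np.logaddexp(0.0, -z) - (1.0 - y) * np.logaddexp(0.0, z))

def gradient(X, y, beta):
    """Gradient with respect to beta: X^T (y - sigmoid(X beta))."""
    s = 1.0 / (1.0 + np.exp(-(X @ beta)))
    return X.T @ (y - s)

def hessian(X, y, beta):
    """Hessian with respect to beta: -X^T diag(s * (1 - s)) X."""
    s = 1.0 / (1.0 + np.exp(-(X @ beta)))
    return -(X.T * (s * (1.0 - s))) @ X

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))                      # M = 6 rows, N = 3 columns
y = rng.integers(0, 2, size=6).astype(float)     # binary selector vector
beta = rng.normal(size=3)

print(log_likelihood(X, y, beta))
print(gradient(X, y, beta))
print(hessian(X, y, beta))
```

The Hessian does not depend on y, and it is symmetric and negative semi-definite, which can serve as a check when comparing against an automatic differentiation result.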

Linear combination of function objects in python

Problem: I want to numerically integrate a function f(t,N) that may be written as a linear combination of N other known functions g_1(t), ..., g_N(t).
My Solution I: I know the functions g_i and also the coefficients, so my initial idea was to create a row vector of coefficients and a column vector containing the lambda functions g_i, and then use np.dot for the inner product to get the function object I want. Unfortunately, you cannot add two function objects or multiply a function object by a scalar.
My Solution II: Of course I can do something like this (basically defining pointwise what I want):
def f(t, N, a, g):
    """
    a = numpy array of coefficients
    g = numpy array of lambda functions corresponding to functions g_i
    """
    res = 0
    for i in range(N):
        res += a[i] * g[i](t)
    return res
But the for loop is not ideal, especially when:
1. I need to run this function at very many time steps t.
2. I pass this function f into a numerical integration routine like scipy.integrate.quad.
Briefly:
In Cython, you could speed up indexing using memoryviews.
If these equations are linear, you could superimpose them using sympy:
Example:
import sympy as sy
x,y = sy.symbols('x y')
g0 = x*0.33 + 6
g1 = x*0.72 + 1.3
g2 = x*11.2 - 6.5
gn = x*3.3 - 7.3
G = [g0,g1,g2,gn]
# superposition: sum the expressions, then substitute (or substitute, then sum)
print(sum(G).subs(x, 15.1))
print(sum(gi.subs(x, 15.1) for gi in G))
'''
output:
228.305000000000
228.305000000000
'''
If that's not what you want, give some example input and output so that I'm not guessing.
With little RAM available, you could pass the final equation to numexpr and evaluate it with some input. Otherwise it's best to work on NumPy arrays.
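Working on NumPy arrays, as suggested above, might look like the following sketch. It assumes every g_i accepts either a scalar or an array t and evaluates elementwise; make_linear_combination is a hypothetical helper name, not a library function.

```python
import numpy as np

def make_linear_combination(a, g):
    """Return f(t) = sum_i a[i] * g[i](t) as an ordinary callable."""
    a = np.asarray(a, dtype=float)

    def f(t):
        # Stack g_1(t), ..., g_N(t): shape (N,) for scalar t, (N, T) for array t.
        values = np.array([g_i(t) for g_i in g])
        # The coefficient sum becomes one matrix-vector product.
        return a @ values

    return f

g = [np.sin, np.cos, lambda t: t ** 2]
a = [1.0, 2.0, 3.0]
f = make_linear_combination(a, g)

print(f(0.0))                        # a single time step
print(f(np.linspace(0.0, 1.0, 5)))   # many time steps in one call
```

The remaining Python loop runs over the N functions only, not over the time steps, and the returned closure takes a single argument t, so it could be handed to an integrator that expects a function of one variable.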

FFT convolution not being faster than the canonical convolution computation

A couple of months ago I found out that convolutions are computed in the fastest possible way using the FFT algorithm (even more with the FFTW library)
With the following code, however, I get results that contradict this.
Imports:
from scipy import fftpack
from numba import jit
Convolution with FFT:
def conv_fft(X, R):
    n = len(X)
    a = fftpack.fft(X)
    b = fftpack.fft(R)
    c = a * b
    e = fftpack.ifft(c)
    result = e[n]
    return result
Convolution using the formula:
@jit(cache=True)
def conv(X, R):
    n = len(X)
    result = complex_type(0)
    for i in range(n+1):
        result += X[n-i] * R[i]
    return result
These are critical functions in a much more complex process; the timing difference arises only from using one version or the other.
         no FFT      with FFT    increment
Test1    0.028761    0.034139    0.0053780
Test2    0.098565    0.103180    0.0046150
Note: Test2 computes more convolutions per test.
The tests show that the code with FFT is slower, and I cannot see why, since fftpack apparently calls the FFTW library, which is "the fastest in the west"...
Any guidance is appreciated.
One conclusion for me is that the numba JIT compilation is unbelievably fast.
You're only returning a single value (the n-th one) of the convolution, not the full array. With the FFT you always calculate all values, whereas your conv function only calculates the one you're after. Complexity-wise, the FFT is O(N*log(N)) and your implementation of conv is O(N). A naive conv function that returned the full convolution would be O(N^2).
So if you want the full convolved array, your best bet is the FFT approach. If you only want the n-th value, your method is the best complexity-wise.
You should also be able to create fewer temporary arrays with something like the following, which should make it faster. Note that overwrite_x only allows the input to be destroyed, so the result still has to be taken from the return value:
def conv_fft(X, R):
    a = fftpack.fft(X, overwrite_x=True)
    a *= fftpack.fft(R)
    return fftpack.ifft(a, overwrite_x=True)
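The complexity argument can be illustrated with a small sketch that computes one output sample directly and the full convolution through the FFT. It uses numpy.fft instead of scipy.fftpack, assumes real-valued inputs, and zero-pads so that the FFT result is a linear rather than circular convolution; conv_single and conv_full_fft are hypothetical names.

```python
import numpy as np

def conv_single(X, R, n):
    """One output sample, sum_{i=0..n} X[n-i] * R[i]: O(n) work."""
    # X[n::-1] is X[n], X[n-1], ..., X[0]; R[:n + 1] is R[0], ..., R[n].
    return np.dot(X[n::-1], R[:n + 1])

def conv_full_fft(X, R):
    """All output samples at once via the FFT: O(N log N) work."""
    size = len(X) + len(R) - 1           # length of the linear convolution
    spectrum = np.fft.rfft(X, size) * np.fft.rfft(R, size)
    return np.fft.irfft(spectrum, size)

X = np.array([1.0, 2.0, 3.0, 4.0])
R = np.array([0.5, -1.0, 2.0, 0.25])

full = conv_full_fft(X, R)
one = conv_single(X, R, 3)
print(full)
print(one, full[3])
```

Both routes agree on the sample they share, but the direct sum does far less work when only that one sample is needed.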

Numpy linalg on multidimensional arrays

Is there a way to use numpy.linalg.det or numpy.linalg.inv on an nx3x3 array (a line in a multiband image), for example? Right now I am doing something like:
det = numpy.array([numpy.linalg.det(i) for i in X])
but surely there is a more efficient way. Of course, I could use map:
det = numpy.array(list(map(numpy.linalg.det, X)))
Any other more direct way?
I'm pretty sure there is no substantially more efficient way than what you have. You can save some memory by first creating an empty array for the results and writing all results directly to that array:
res = numpy.empty_like(X)
for i, A in enumerate(X):
    res[i] = numpy.linalg.inv(A)
This won't be any faster, though -- it will only use less memory.
A "normal" determinant is only defined for a matrix (dimension = 2), so if that's what you want, I don't see another way.
If you really want to compute the determinant of a cube, you could try to implement one of the approaches described here:
http://en.wikipedia.org/wiki/Hyperdeterminant
Note that it is not necessarily the same value as the one you're currently computing.
New answer to an old question: Since version 1.8.0, numpy supports evaluating a batch of 2D matrices. For a batch of MxM matrices, the input and output now look like:
linalg.det(a)
    Compute the determinant of an array.
    Parameters: a : (..., M, M) array_like
        Input array to compute determinants for.
    Returns: det : (...) array_like
        Determinant of a.
Note the ellipsis. There can be multiple "batch dimensions", so you can, for example, evaluate determinants on a meshgrid.
https://numpy.org/doc/stable/reference/generated/numpy.linalg.det.html
https://numpy.org/doc/stable/reference/generated/numpy.linalg.inv.html
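A short sketch of the batched calls on an n x 3 x 3 array, compared against the list comprehension from the question. The data is random and only for illustration; a multiple of the identity is added so that the matrices are comfortably invertible.

```python
import numpy as np

rng = np.random.default_rng(0)
# Five 3x3 matrices stacked along the first axis, shape (5, 3, 3).
X = rng.normal(size=(5, 3, 3)) + 3.0 * np.eye(3)

dets = np.linalg.det(X)    # shape (5,): one determinant per matrix
invs = np.linalg.inv(X)    # shape (5, 3, 3): one inverse per matrix

# Reference result using the per-matrix loop from the question.
loop_dets = np.array([np.linalg.det(A) for A in X])

print(dets.shape, invs.shape)
print(np.allclose(dets, loop_dets))
```

The same calls should work unchanged with more leading axes, for example an array of shape (rows, cols, 3, 3) for a whole image.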
