I will have to implement a convolution of two functions in Python, but SciPy/NumPy appear to have functions only for the convolution of two arrays.
Before I try to implement this using the regular integral definition of convolution, I would like to ask if someone knows of an already available module that performs these operations.
Failing that, which of the several kinds of integration that SciPy provides is the best suited for this?
Thanks!
You could implement the discrete convolution yourself if you need it point by point.
Yes, SciPy/NumPy is mostly concerned with arrays.
If you can tolerate an approximate solution, and your functions only operate over a finite range of values (not an infinite one), you can fill arrays with sampled values and convolve the arrays (see the sketch below).
If you want something more "correct" calculus-wise, you would probably need a powerful symbolic solver (Mathematica, Maple, ...).
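A minimal sketch of the sampling approach (the functions f and g here are just placeholders for your own):

```python
import numpy as np

# Hypothetical functions to convolve; substitute your own.
def f(t):
    return np.exp(-t**2)

def g(t):
    return np.exp(-np.abs(t))

dt = 0.01
t = np.arange(-10, 10, dt)

# The discrete convolution of the samples, scaled by the step size,
# approximates the continuous convolution integral on this grid.
conv = np.convolve(f(t), g(t), mode='same') * dt
```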
My problem is to perform 3 matrix multiplications on a 3D NumPy array A that is too large to fit on a single processor. In tensor notation, I want A_ijk B_km C_jn D_ip (B, C, and D can all fit in memory). I want to know whether dask is appropriate for this task (or whether another tool might be better suited).
I believe the best approach is to split this operation into the individual multiplications and make sure each one is local. This link has a really useful diagram that summarises what I'm talking about: http://www.2decomp.org/1d_mode.html.
In more detail: first, to compute A_ijk B_km, I should distribute A over the first two axes and perform the matrix multiplication on each pencil locally (the first step in the diagram).
Then I need to transpose the array, making the j axis local to each processor (and splitting over the k (now m) axis), in order to perform the next multiplication (going from the first to the second step in the diagram). This is where I wonder whether dask could help.
I'm aware that this can be done in principle using mpi4py, but the steps are pretty non-trivial, whereas dask arrays have helpful rechunk and transpose methods, which feel relevant to this application.
Does this seem like something well-suited to dask?
If not, is anyone aware of any Python libraries that can perform these steps? I know that FFTW has routines for doing exactly this, but I don't know how to write the necessary C code or how to get it to interface with Python and NumPy.
Thanks for any help.
For anyone else in the future: mpi4py does have a transpose mechanism, but it's called Alltoall/Alltoallv. It's not explained in the mpi4py documentation or tutorial; I found out about it in another tutorial: https://info.gwdg.de/wiki/doku.php?id=wiki:hpc:mpi4py.
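A minimal sketch of how Alltoall redistributes blocks between ranks (the buffer sizes are illustrative; a real pencil transpose also needs a local reshape afterwards):

```python
# Run with e.g.: mpiexec -n 4 python alltoall_demo.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

n = 4  # illustrative block length
# Each rank holds `size` contiguous blocks of length n.
send = np.arange(size * n, dtype='d') + 100 * rank
recv = np.empty(size * n, dtype='d')

# Block i of rank r ends up as block r of rank i -- the
# communication pattern underlying a distributed transpose.
comm.Alltoall(send, recv)
```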
Dask implements einsum, which may be what you are after, and there is, of course, matmul if you want to write out the operation. As long as your large array A is a Dask array with reasonable chunk sizes, Dask will parcel out the work without running out of memory.
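A minimal sketch of what that could look like (the shapes and chunk sizes are made up; check the output index order against your own convention):

```python
import numpy as np
import dask.array as da

rng = np.random.default_rng(0)

# Large 3D array as a chunked Dask array; B, C, D fit in memory.
A = da.random.random((400, 400, 400), chunks=(100, 100, 400))
B = rng.standard_normal((400, 50))   # contracts with k
C = rng.standard_normal((400, 50))   # contracts with j
D = rng.standard_normal((400, 50))   # contracts with i

# A_ijk B_km C_jn D_ip, leaving free indices (p, n, m)
result = da.einsum('ijk,km,jn,ip->pnm', A, B, C, D)
out = result.compute()
```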
If I have a vector space spanned by five vectors v1, ..., v5, and I want to find an orthogonal basis for A, where A = [v1, v2, ..., v5] and A is 5 × n,
should I use np.linalg.qr(A) or scipy.linalg.orth(A)?
Thanks in advance
Note that scipy.linalg.orth uses the SVD, while np.linalg.qr uses a QR factorization. Both factorizations are obtained via wrappers for LAPACK functions.
I don't think there is a strong preference for one over the other. The SVD will be slightly more stable but also a bit slower to compute. In practice I don't think you will really see much of a difference.
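To illustrate (the shape here is made up; both calls return an orthonormal basis for the column space):

```python
import numpy as np
from scipy.linalg import orth

rng = np.random.default_rng(0)
A = rng.standard_normal((10, 5))  # illustrative: columns v1..v5

# QR: the columns of Q form an orthonormal basis for the column space
Q, R = np.linalg.qr(A)

# SVD-based: orth drops numerically dependent columns via a rank tolerance
basis = orth(A)

print(Q.shape, basis.shape)  # both (10, 5) when A has full column rank
```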
You'll want to use:
scipy.linalg.orth(A)
The generally accepted rule is to use scipy.linalg, because it covers more functionality than np.linalg. Hope that helps!
Are there really good methods in Python to vectorize operations on matrix-like data constructs/containers? And which data constructs should be used?
(I have observed and read that element-wise operations in pandas and NumPy using vectorize or applymap (and likewise apply/apply_along_axis for rows/columns) are not much of a speed improvement over for loops.
Given that, when trying to use them, you sometimes have to wrestle with datatype specifics, whereas that is usually a little easier in for loops, what are the benefits? Readability?)
Are there ways to achieve a performance gap similar to the one in MATLAB between for loops and vectorized operations?
(Note: this is not to bash NumPy or pandas; they are great, and whole-matrix operations are fine. It is just that when you have to do element-wise operations, things become slow.)
EDIT to explain the context:
I was only wondering because I have received more than one answer mentioning that apply and the like are actually similar to for loops. That's why I was wondering whether there were similar functions implemented in a way that performs better. The actual problems were varied; they just had to be element-wise, not "the sum, product, or whatever of the whole matrix". I did a lot of comparisons with differential outputs, sometimes based on other matrices, so I had to use complex functions for this. But since the matrices are huge and the implementation depended on for-loop-like mechanisms, I felt in the end that my program would not work well on a larger dataset. Hence my question. But I was not looking for review, only knowledge.
You need to provide a specific example.
Normal per-element MATLAB or Python functions cannot be vectorized in general. The whole point of vectorizing, in both MATLAB and Python, is to off-load the operation onto the much faster underlying C or Fortran libraries that are designed to work on arrays of uniform data. This cannot be done for functions that operate on scalars, whether in MATLAB or Python.
For functions that operate on arrays or matrices as a whole (such as mathematical operators, sum, square, etc.), MATLAB and Python behave the same. In fact, they use most of the same underlying C and Fortran libraries for their calculations.
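A rough illustration of the difference (a toy example, not your actual operation):

```python
import numpy as np

x = np.random.random(1_000_000)

# Per-element loop: every iteration goes through the Python interpreter.
total = 0.0
for v in x:
    total += v * v

# Vectorized: a single call, with the loop running in compiled code.
total_vec = float(np.sum(x * x))
```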
So you need to show the actual operation you want to do, and then we can see if there is a way to vectorize it.
If it is working code and you just want to improve its performance, then the Code Review Stack Exchange site is probably a better choice.
Are there functions in Python that will fill in missing values in a matrix for you, using collaborative filtering (e.g. an alternating minimization algorithm)? Or does one need to implement such functions from scratch?
[EDIT]: Although this isn't a matrix-completion example, just to illustrate a similar situation: I know there is an svd() function in MATLAB that takes a matrix as input and automatically outputs its singular value decomposition (SVD). I'm looking for something like that in Python, hopefully a built-in function, but even a good library out there would be great.
Check out NumPy's linalg module for a Python SVD implementation.
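For example:

```python
import numpy as np

A = np.random.random((5, 3))
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Reconstruction check: A is recovered from its factors
print(np.allclose(A, U @ np.diag(s) @ Vt))  # True
```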
There is a library called fancyimpute. Also, scikit-learn has an NMF implementation.
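A minimal sketch using fancyimpute's SoftImpute, assuming missing entries are marked as NaN (the data here is made up, and the API may differ between versions):

```python
import numpy as np
from fancyimpute import SoftImpute

# Toy matrix with missing entries marked as NaN (illustrative data).
X = np.array([[1.0, 2.0, np.nan],
              [4.0, np.nan, 6.0],
              [np.nan, 8.0, 9.0]])

# SoftImpute fills in the NaNs via low-rank matrix completion.
X_filled = SoftImpute().fit_transform(X)
```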
I need to perform a definite integral in Python (or, alternatively, in Fortran), but instead of having a function I have an array of samples. I'm currently using scipy.integrate.trapz. However, I'm wondering whether there is an alternative method for this operation (for instance, the quadrature methods would be good ones, but I'm afraid they take functions rather than arrays of samples). Any suggestions?
If you only have an array of samples, you can use numpy.trapz(y[, x]) instead of scipy.integrate.trapz.
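For sampled data you can also try Simpson's rule, which usually gives better accuracy than the trapezoidal rule (scipy.integrate.simpson in recent SciPy; older versions call it simps):

```python
import numpy as np
from scipy import integrate

x = np.linspace(0, np.pi, 101)
y = np.sin(x)  # illustrative samples; the exact integral is 2

I_trapz = np.trapz(y, x)
I_simpson = integrate.simpson(y, x=x)  # simps(y, x) in older SciPy
```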