In Octave or Matlab there is a neat, compact way to create large Toeplitz matrices, for example:
T = toeplitz([1,-0.25,zeros(1,20)])
That saves a lot of time that would otherwise be spent to fill the matrix with dozens or hundreds of zeros by using extra lines of code.
Yet I don't seem to be able to do the same with neither scipy or numpy, although those two libraries have both toeplitz() and zeros() functions.
Is there a similar way to do that or do I have to put together a routine of my own to do that (not a huge problem, but still a nuisance)?
Thanks,
F.
Currently I think the best you can do is:
from numpy import concatenate, zeros
from scipy.linalg import toeplitz
toeplitz(concatenate([[1., -.25], zeros(20)]))
As of python 3.5 however we'll have:
toeplitz([1., -.25, *zeros(20)])
So that's something to look forward to.
Another option is using r_:
import numpy as np
from scipy.linalg import toeplitz
toeplitz(np.r_[1, -0.25, np.zeros(20)])
You can also work with lists:
toeplitz([1, -0.25] + [0]*20)
Related
I'm doing a data analysis project where I'm working with really large numbers. I originally did everything in pure python but I'm now trying to do it with numpy and pandas. However it seems like I've hit a roadblock, since it is not possible to handle integers larger than 64 bits in numpy (if I use python ints in numpy they max out at 9223372036854775807). Do I just throw away numpy and pandas completely or is there a way to use them with python-style arbitrary large integers? I'm okay with a performance hit.
by default numpy keeps elements as number datatype.
But you can force typing to object, like below
import numpy as np
x = np.array([10,20,30,40], dtype=object)
x_exp2 = 1000**x
print(x_exp2)
the output is
[1000000000000000000000000000000
1000000000000000000000000000000000000000000000000000000000000
1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000]
The drawback is that the execution is much slower.
Later Edit to show that np.sum() works. There could be some limitations of course.
import numpy as np
x = np.array([10,20,30,40], dtype=object)
x_exp2 = 1000**x
print(x_exp2)
print(np.sum(x_exp2))
print(np.prod(x_exp2))
and the output is:
[1000000000000000000000000000000
1000000000000000000000000000000000000000000000000000000000000
1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000]
1000000000000000000000000000001000000000000000000000000000001000000000000000000000000000001000000000000000000000000000000
1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Can someone help me with this?
Have you checked numpy library? it makes working with matrices a lot easier. You can check it out in this links: https://www.tutorialspoint.com/python_data_structure/python_matrix.htm
numpy gives you a .dot method for matrix multiplication. Hope this guides you enough --> https://numpy.org/doc/stable/reference/generated/numpy.dot.html#numpy.dot
If you have a matrix like
mat=np.array([[1,2,3,4,5],
[2,3,4,5,6],
[3,4,5,6,7]])
and an array like
arr=np.array([1,2,3])
Then the required function using numpy multiplication can be so simple:
def multiplicate(mat,arr):
mat*arr.reshape((3,1))
I have a Python program that needs to pass an array to a .dll that is expecting an array of c doubles. This is currently done by the following code, which is the fastest method of conversion I could find:
from array import array
from ctypes import *
import numpy as np
python_array = np.array(some_python_array)
temp = array('d', python_array.astype('float'))
c_double_array = (c_double * len(temp)).from_buffer(temp)
...where 'np.array' is just there to show that in my case the python_array is a numpy array. Let's say I now have two c_double arrays: c_double_array_a and c_double_array_b, the issue I'm having is I would like to append c_double_array_b to c_double_array_a without reconverting to/from whatever python typically uses for arrays. Is there a way to do this with the ctypes library?
I've been reading through the docs here but nothing seems to detail combining two c_type arrays after creation. It is very important in my program that they can be combined after creation, of course it would be trivial to just append python_array_b to python_array_a and then convert but that won't work in my case.
Thanks!
P.S. if anyone knows a way to speed up the conversion code that would also be greatly appreciated, it takes on the order of 150ms / million elements in the array and my program typically handles 1-10 million elements at a time.
Leaving aside the construction of the ctypes arrays (for which Mark's comment is surely relevant), the issue is that C arrays are not resizable: you can't append or extend them. (There do exist wrappers that provide these features, which may be useful references in constructing this.) What you can do is make a new array of the size of the two existing arrays together and then ctypes.memmove them into it. It might be possible to improve the performance by using realloc, but you'd have to go even lower than normal ctypes memory management to use it.
I have a matlab code that I'm trying to translate in python.
I'm new on python but I have been able to answer a lot of questions googling a little bit.
But now, I'm trying to figure out the following:
I have a for loop when I apply different things on each column, but you don't know the number of columns. For example.
In matlab, nothing easier than this:
for n = 1:size(x,2); y(n) = mean(x(:,n)); end
But I have no idea how to do it on python when, for example, the number of columns is 1, because I can't do x[:,1] in python.
Any idea?
Thanx
Yes, if you use numpy you can use x[:,1], and also you get other data structures (vectors instead of lists), the main difference between matlab and numpy is that matlab uses matrices for calculations and numpy uses vectors, but you get used to it, I think this guide will help you out.
Try numpy. It is a python bindings for high-performance math library written in C. I believe it has the same concepts of matrix slice operations, and it is significantly faster than the same code written in pure python (in most cases).
Regarding your example, I think the closest would be something using numpy.mean.
In pure python it is hard to calculate mean of column, but it you are able to transpose the matrix you could do it using something like this:
# there are no builtin avg function
def avg(lst):
return sum(lst)/len(lst)
rows = list(avg(row) for row in a)
this is one way to do it
from numpy import *
x=matrix([[1,2,3],[2,3,4]])
[mean(x[:,n]) for n in range(shape(x)[1])]
# [1.5, 2.5, 3.5]
I wonder if it is possible to exactly reproduce the whole sequence of randn() of MATLAB with NumPy. I coded my own routine with Python/Numpy, and it is giving me a little bit different results from the MATLAB code somebody else did, and I am having hard time finding out where it is coming from because of different random draws.
I have found the numpy.random.seed value which produces the same number for the first draw, but from the second draw and on, it is completely different. I'm making multivariate normal draws for about 20,000 times so I don't want to just save the matlab draws and read it in Python.
The user asked if it was possible to reproduce the output of randn() of Matlab, not rand. I have not been able to set the algorithm or seed to reproduce the exact number for randn(), but the solution below works for me.
In Matlab: Generate your normal distributed random numbers as follows:
rng(1);
norminv(rand(1,5),0,1)
ans =
-0.2095 0.5838 -3.6849 -0.5177 -1.0504
In Python: Generate your normal distributed random numbers as follows:
import numpy as np
from scipy.stats import norm
np.random.seed(1)
norm.ppf(np.random.rand(1,5))
array([[-0.2095, 0.5838, -3.6849, -0.5177,-1.0504]])
It is quite convenient to have functions, which can reproduce equal random numbers, when moving from Matlab to Python or vice versa.
If you set the random number generator to the same seed, it will theoretically create the same numbers, ie in matlab. I am not quite sure how to best do it, but this seems to work, in matlab do:
rand('twister', 5489)
and corresponding in numy:
np.random.seed(5489)
To (re)initalize your random number generators. This gives for me the same numbers for rand() and np.random.random(), however not for randn, I am not sure if there is an easy method for that.
With newer matlab versions you can probably set up a RandStream with the same properties as numpy, for older you can reproduce numpy's randn in matlab (or vice versa). Numpy uses the polar form to create the uniform numbers from np.random.random() (the second algorithm given here: http://www.taygeta.com/random/gaussian.html). You could just write that algorithm in matlab to create the same randn numbers as numpy does from the rand function in matlab.
If you don't need a huge amount of random numbers, just save them in a .mat and read them from scipy.io though...
Just wanted to further clarify on using the twister/seeding method: MATLAB and numpy generate the same sequence using this seeding but will fill them out in matrices differently.
MATLAB fills out a matrix down columns, while python goes down rows. So in order to get the same matrices in both, you have to transpose:
MATLAB:
rand('twister', 1337);
A = rand(3,5)
A =
Columns 1 through 2
0.262024675015582 0.459316887214567
0.158683972154466 0.321000540520167
0.278126519494360 0.518392820597537
Columns 3 through 4
0.261942925565145 0.115274226683149
0.976085284877434 0.386275068634359
0.732814552690482 0.628501179539712
Column 5
0.125057926335599
0.983548605143641
0.443224868645128
python:
import numpy as np
np.random.seed(1337)
A = np.random.random((5,3))
A.T
array([[ 0.26202468, 0.45931689, 0.26194293, 0.11527423, 0.12505793],
[ 0.15868397, 0.32100054, 0.97608528, 0.38627507, 0.98354861],
[ 0.27812652, 0.51839282, 0.73281455, 0.62850118, 0.44322487]])
Note: I also placed this answer on this similar question: Comparing Matlab and Numpy code that uses random number generation