How can I represent matrices in Python?
Take a look at this answer:
from numpy import matrix
from numpy import linalg

A = matrix([[1, 2, 3], [11, 12, 13], [21, 22, 23]])  # Creates a matrix.
x = matrix([[1], [2], [3]])  # Creates a matrix (like a column vector).
y = matrix([[1, 2, 3]])  # Creates a matrix (like a row vector).
print(A.T)  # Transpose of A.
print(A * x)  # Matrix multiplication of A and x.
print(A.I)  # Inverse of A.
print(linalg.solve(A, x))  # Solve the linear equation system.
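Note that current NumPy documentation no longer recommends the numpy.matrix class; plain np.array objects with the @ operator cover the same ground. A minimal sketch of the equivalent operations with np.array, using a small invertible matrix chosen just for illustration:
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])       # illustrative invertible matrix
x = np.array([[1.0], [2.0], [3.0]])   # column vector as a 3x1 array

print(A.T)                    # transpose
print(A @ x)                  # matrix multiplication
print(np.linalg.inv(A))       # inverse
print(np.linalg.solve(A, x))  # solve A @ v = x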
Python doesn't have a built-in matrix type. You can use a list of lists, or a NumPy array.
If you are not going to use the NumPy library, you can use a nested list (a 2-dimensional list). This is code to build such a nested list dynamically.
Let r be the number of rows:
r = 3
m = []
for i in range(r):
    m.append([int(x) for x in input().split()])
At any time you can append another row with:
m.append([int(x) for x in input().split()])
Above, you enter the matrix row-wise. To append a column:
for row in m:
    row.append(x)  # x is the value to be added to each row (the new column entry)
To print the matrix:
print(m)  # everything on a single line
for row in m:
    print(row)  # each row on its own line
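Putting the pieces above together, a minimal end-to-end sketch with hard-coded values instead of interactive input:
m = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]              # 3x3 matrix as a nested list

m.append([10, 11, 12])       # append a row

for row in m:
    row.append(0)            # append a column of zeros

for row in m:
    print(row)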
A matrix can also be represented as a tuple of tuples:
((1, 2, 3, 4),
 (5, 6, 7, 8),
 (9, 0, 1, 2))
Using tuples instead of lists makes it marginally harder to change the data structure in unwanted ways.
If you are going to make extensive use of these, you are best off wrapping a true number array in a class, so you can define methods and properties on them. (Or you could use NumPy, SciPy, ... if you are going to do your processing with those libraries.)
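For example, a minimal sketch of such a wrapper class (the class name and methods are purely illustrative, not from any library):
class Matrix:
    """Thin wrapper around a list of lists, so you can attach methods and properties."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    @property
    def shape(self):
        return (len(self.rows), len(self.rows[0]) if self.rows else 0)

    def transpose(self):
        return Matrix(zip(*self.rows))

    def __getitem__(self, idx):
        i, j = idx
        return self.rows[i][j]

    def __repr__(self):
        return "\n".join(" ".join(str(v) for v in r) for r in self.rows)


m = Matrix([[1, 2, 3], [4, 5, 6]])
print(m.shape)        # (2, 3)
print(m[1, 2])        # 6
print(m.transpose())  # the 3x2 transpose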
Related
I would like to calculate the log-ratios for my 2D array, e.g.
import numpy as np

a = np.array([[3, 2, 1, 4], [2, 1, 1, 6], [1, 5, 9, 1], [7, 8, 2, 2], [5, 3, 7, 8]])
The formula is ln(x/g(x)), where g(x) is the geometric mean of each row. I execute it like this:
import math

logvalues = np.array(a, dtype=float)  # the values will be overwritten by the loop below.
for i in range(len(a)):
    row = np.array(a[i])
    geo_mean = row.prod() ** (1.0 / len(row))
    flr = lambda x: math.log(x / geo_mean)
    logvalues[i] = np.array([flr(x) for x in row])
I was wondering if there is any way to vectorise the above lines (preferably without introducing other modules) to make it more efficient?
This should do the trick:
geo_means = a.prod(1)**(1/a.shape[1])
logvalues = np.log(a/geo_means[:, None])
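This relies on broadcasting: geo_means has shape (5,), and geo_means[:, None] turns it into a column of shape (5, 1), so it divides each row of a. A quick check of the shapes involved:
print(a.shape)                   # (5, 4)
print(geo_means.shape)           # (5,)
print(geo_means[:, None].shape)  # (5, 1) -- broadcasts against (5, 4) row by row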
Another way you could do this is just write the function as though for a single 1-D array, ignoring the 2-D aspect:
def f(x):
    return np.log(x / x.prod()**(1.0 / len(x)))
Then if you want to apply it to all rows in a 2-D array (or N-D array):
>>> np.apply_along_axis(f, 1, a)
array([[ 0.30409883, -0.10136628, -0.79451346, 0.5917809 ],
[ 0.07192052, -0.62122666, -0.62122666, 1.17053281],
[-0.95166562, 0.65777229, 1.24555895, -0.95166562],
[ 0.59299864, 0.72653003, -0.65976433, -0.65976433],
[-0.07391256, -0.58473818, 0.26255968, 0.39609107]])
Some other general notes on your attempt:
- for i in range(len(a)): if you want to loop over all rows of an array, it is generally faster to write simply for row in a. NumPy can optimize that case somewhat, whereas with for idx in range(len(a)) you have to index the array again with a[idx] on every iteration, which is slower. But, as you already know, it is better not to use a Python for loop at all where possible.
- row = np.array(a[i]): the np.array() call isn't necessary. If you index a multi-dimensional array, the returned value is already an array.
- lambda x: math.log(x/geo_mean): don't use math functions with NumPy arrays; use the equivalents in the numpy module. Wrapping this in a function adds unnecessary overhead as well. Since you use it as [flr(x) for x in row], it is equivalent to the already vectorized NumPy operation np.log(row / geo_mean).
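Putting those notes together, a sketch of the loop rewritten along these lines (still slower than the fully vectorized answers above, but cleaner):
import numpy as np

logvalues = np.empty_like(a, dtype=float)
for i, row in enumerate(a):                    # iterate over rows directly
    geo_mean = row.prod() ** (1.0 / len(row))  # geometric mean of this row
    logvalues[i] = np.log(row / geo_mean)      # vectorized over the whole row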
I need to create an array of a specific size mxn filled with empty values so that when I concatenate to that array the initial values will be overwritten with the added values.
My current code:
a = numpy.empty([2,2]) # Make empty 2x2 matrix
b = numpy.array([[1,2],[3,4]]) # Make example 2x2 matrix
myArray = numpy.concatenate((a,b)) # Combine empty and example arrays
Unfortunately, I end up making a 4x2 matrix instead of a 2x2 matrix with the values of b.
Is there any way to make an actually empty array of a certain size so that when I concatenate to it, its values become my added values instead of the default values plus the added values?
Like Oniow said, concatenate does exactly what you saw.
If you want 'default values' that differ from regular scalar elements, I would suggest initializing your array with NaNs as the 'default value'. If I understand your question, you want to merge matrices so that regular scalars override your 'default value' elements.
Anyway, I suggest adding the following:
import numpy as np

def get_default(size_x, size_y):
    # returns a new matrix filled with 'default values' (NaN)
    tmp = np.empty([size_x, size_y])
    tmp.fill(np.nan)
    return tmp
And also:
def merge(a, b):
    # keep a's value unless it is NaN (the 'default'), in which case take b's
    l = lambda x, y: y if np.isnan(x) else x
    l = np.vectorize(l)
    return l(a, b)  # the vectorized function is applied element-wise to both arrays
Note that if you merge two matrices and both values are non-'default', it will take the value of the left matrix.
Using NaN as the default value gives the behavior you would expect from a default: for example, any math operation involving it yields NaN again, which signals that you don't really care about that index in the matrix.
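A quick usage sketch of the two helpers above (the concrete values are just illustrative):
import numpy as np

defaults = get_default(2, 2)            # 2x2 matrix of NaNs
defaults[0, 0] = 7.0                    # pretend one value was already set
b = np.array([[1.0, 2.0], [3.0, 4.0]])

print(merge(defaults, b))
# [[7. 2.]
#  [3. 4.]]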
If I understand your question correctly - concatenate is not what you are looking for. Concatenate does as you saw: joins along an axis.
If you are trying to have an 'empty' matrix that takes on the values of another, you could do the following:
import numpy as np
a = np.zeros([2,2])
b = np.array([[1,2],[3,4]])
my_array = a + b
--or--
import numpy as np
my_array = np.zeros([2,2]) # you can use empty here instead in this case.
my_array[0,0] = float(input('Enter first value: ')) # However you get your data to put them into the arrays.
But, I am guessing that is not what you really want as you could just use my_array = b. If you edit your question with more info I may be able to help more.
If you are worried about values adding over time to your array...
import numpy as np

a = np.zeros([2, 2])
my_array = b  # b is some other 2x2 matrix
''' Do stuff '''
# ... new_b appears (a new 2x2 array of results)
my_array = new_b  # update your array to these new values; the values will not add.
# Note: my_array is now just another name for new_b, so changes to my_array also affect new_b.
# To avoid this, at the cost of some memory:
my_array = np.copy(new_b)
I am having a small issue understanding indexing in Numpy arrays. I think a simplified example is best to get an idea of what I am trying to do.
So first I create an array of zeros of the size I want to fill:
x = range(0,10,2)
y = range(0,10,2)
a = zeros(len(x),len(y))
so that will give me an array of zeros that will be 5X5. Now, I want to fill the array with a rather complicated function that I can't get to work with grids. My problem is that I'd like to iterate as:
for i in range(0, 10, 2):
    for j in range(0, 10, 2):
        .........
        "do function and fill the array corresponding to (i, j)"
however, right now what I would like to be a[2,10] is a function of 2 and 10 but instead the index for a function of 2 and 10 would be a[1,4] or whatever.
Again, maybe this is elementary, I've gone over the docs and find myself at a loss.
EDIT:
In the end I vectorized as much as possible and wrote the simulation loops that I could not vectorize in Cython. I also used Joblib to parallelize the operation. I stored the results in a list because an array was not filling correctly when running in parallel. I then used itertools to split the list into individual results and Pandas to organize them.
Thank you for all the help
Some tips for you to get things done while keeping good performance:
- avoid Python `for` loops
- create a function that can deal with vectorized inputs
Example:
def f(xs, ys):
    return xs**2 + ys**2 + xs*ys
where you can pass xs and ys as arrays and the operation will be done element-wise:
import numpy as np

xs = np.random.random((100, 200))
ys = np.random.random((100, 200))
f(xs, ys)
You should read more about NumPy broadcasting to get a better understanding of how array operations work. This will help you design a function that handles arrays properly.
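For instance, a minimal broadcasting sketch (the coordinate values are only illustrative) that evaluates f on a full grid without any Python loop:
import numpy as np

def f(xs, ys):
    return xs**2 + ys**2 + xs*ys

x = np.arange(0, 10, 2)          # coordinate values 0, 2, 4, 6, 8
y = np.arange(0, 10, 2)

# broadcast a column against a row to get a 5x5 grid of results
a = f(x[:, None], y[None, :])
print(a.shape)                   # (5, 5)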
First, you are missing some parentheses in the call to zeros; the first argument should be a tuple:
a = zeros((len(x),len(y)))
Then the corresponding indices into your table are i // 2 and j // 2:
for i in range(0, 10, 2):
    for j in range(0, 10, 2):
        # do the function and fill the array entry corresponding to (i, j)
        a[i // 2, j // 2] = 1
But I second Saullo Castro, you should try to vectorize your computations.
I'm using Numpy and have a 7x12x12 matrix whose values I would like to populate in 12x12 chunks, 7 different times. Suppose I have these 12x12 matrices:
first_Matrix
second_Matrix
third_Matrix
... (etc)
seventh_Matrix = first_Matrix + second_Matrix + third_Matrix...
that I'd like to add to:
grand_Matrix
How can I do this? I assume there is a better way than loops that map the coordinates from one matrix to the next, and if there's not, could someone please write out the code for mapping first_Matrix into the first 12x12 element of grand_Matrix?
Assuming grand_Matrix has already been created with the right shape, for example grand_Matrix = np.zeros((7, 12, 12)):
grand_Matrix[0, ...] = first_Matrix
grand_Matrix[1, ...] = second_Matrix
and so on.
Anyway, as @Lattyware commented, it is bad design to have separate names for so many homogeneous objects.
If you have a list of 12x12 matrices:
grand_Matrix = np.vstack([m[None, ...] for m in matrices])
None adds a new leading dimension to each matrix, and vstack stacks them along that dimension.
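A quick usage sketch, assuming the seven matrices are collected in a list; in recent NumPy versions np.stack(matrices) does the same job:
import numpy as np

# seven 12x12 matrices collected in a list (illustrative random data)
matrices = [np.random.random((12, 12)) for _ in range(7)]

grand_Matrix = np.vstack([m[None, ...] for m in matrices])
print(grand_Matrix.shape)   # (7, 12, 12)

# equivalently:
grand_Matrix = np.stack(matrices)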
I just started to learn to program in Python and I am trying to construct a sparse matrix using the SciPy package. I found that there are different types of sparse matrices, but all of them either require you to store the data in three vectors (row, col, data), or, if you want to set each entry separately, like S(i, j) = s_ij, you need to initialize the matrix with a given size.
My question is whether there is a way to store the matrix entry-wise, without needing the size up front, like a dictionary.
No. Any matrix in Scipy, sparse or not, must be instantiated with a size.
You can use a regular dictionary with tuples of two integers as keys. For example:
matrix = {}
matrix[5, 7] = 1
matrix[3, 8] = 5
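Once you do know the final size, such a dictionary can be converted into a SciPy sparse matrix, for example a COO matrix; a minimal sketch:
from scipy.sparse import coo_matrix

matrix = {(5, 7): 1, (3, 8): 5}

rows, cols = zip(*matrix.keys())
data = list(matrix.values())

S = coo_matrix((data, (rows, cols)), shape=(6, 9))  # shape must cover the largest indices
print(S.toarray())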
Here is a simple example that reads a matrix entry by entry, stores only the non-zero entries in a dictionary keyed by (row, column), and prints the matrix back out:
dic = {}
a, b = int(input("Enter the order:")), int(input())
for i in range(a):
    for j in range(b):
        c = int(input())
        if c != 0:
            dic[(i, j)] = c
if len(dic) <= (a + b) / 2:
    print("sparse matrix")
else:
    print("non-sparse matrix")
for i in range(a):
    for j in range(b):
        print(dic.get((i, j), 0), end=" ")
    print()