How do I sum all the columns in a 2D matrix in Python? - python

I used the following code to sum all the rows in a 2D matrix but I want to sum all the columns instead:
row_sum = sum(map(sum,[arr]))

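You can use np.sum with axis=0, which adds the values down each column (axis=1 would give the row sums instead):
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])  # example 2D matrix
col_sum = np.sum(arr, axis=0, keepdims=True)  # array([[5, 7, 9]])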
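As a minimal sketch of the difference between the two axes, the snippet below sums a small example matrix by column and by row. The `matrix` values and variable names are made up for illustration. The pure-Python variant is included because the question starts from built-in `sum` and `map`.

```python
import numpy as np

matrix = [[1, 2, 3],
          [4, 5, 6]]  # hypothetical example data

# Pure Python: zip(*matrix) regroups the values column by column.
col_sums_py = [sum(col) for col in zip(*matrix)]

# NumPy: axis=0 collapses the rows, leaving one total per column.
col_sums_np = np.sum(np.array(matrix), axis=0)

# For comparison, axis=1 collapses the columns, leaving one total per row.
row_sums_np = np.sum(np.array(matrix), axis=1)

print(col_sums_py)
print(col_sums_np)
print(row_sums_np)
```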

Related

Correlation of columns of two arrays in python

I have two arrays: 900x421 and 900x147. I need to correlate all columns from these arrays so that the output is 421x147. In Matlab, the corr() function does this, but I can't find a function that does the same in Python.
The numpy.corrcoef function is the way to go. It returns the correlation between every pair of variables it is given, so you can concatenate the two arrays and then slice out the block you need. Let's say arr1 has shape 900x421 and arr2 has shape 900x147. You can do the following:
import numpy as np
two_arrays = np.concatenate((arr1, arr2), axis=1) # 900x568
corr = np.corrcoef(two_arrays.T) # 568x568 array
desired_output = corr[0:421, 421:]
By default, np.corrcoef treats each row as a variable and each column as an observation. That is why we need to transpose the array.
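If the full 568x568 matrix is not needed, the 421x147 block can also be computed directly. The following is a sketch rather than a library routine: `column_correlation` is a hypothetical helper name, and the random inputs only stand in for the real arrays. It centers each column, scales it to unit length, and takes dot products, which is the definition of the Pearson correlation.

```python
import numpy as np

def column_correlation(a, b):
    """Pearson correlation between every column of a and every column of b."""
    a_centered = a - a.mean(axis=0)
    b_centered = b - b.mean(axis=0)
    # Scale each centered column to unit Euclidean length.
    a_unit = a_centered / np.linalg.norm(a_centered, axis=0)
    b_unit = b_centered / np.linalg.norm(b_centered, axis=0)
    # Dot products of unit-length centered columns are correlations.
    return a_unit.T @ b_unit

rng = np.random.default_rng(0)
arr1 = rng.normal(size=(900, 421))  # placeholder data
arr2 = rng.normal(size=(900, 147))  # placeholder data

direct = column_correlation(arr1, arr2)
via_corrcoef = np.corrcoef(np.concatenate((arr1, arr2), axis=1).T)[0:421, 421:]

print(direct.shape)
print(np.allclose(direct, via_corrcoef))
```

This avoids computing the 421x421 and 147x147 blocks that the slicing approach discards. A constant column would cause a division by zero here, so this assumes every column varies.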

How to concatenate more sparse matrices into one in python

I have a problem in Python where I would like to merge several sparse matrices into one. The sparse matrices are of csr_matrix type and all have the same number of rows. When I use hstack to stack them together I obtain an array of matrices, but I would like to obtain a single matrix with the same number of rows and with a number of columns equal to the sum of the column counts of all the matrices.
Thanks for your support.
You can do this using scipy.sparse.hstack. For example:
import numpy as np
from scipy import sparse
x = sparse.csr_matrix(np.random.randint(0, 2, size=(10, 10)))
y = sparse.csr_matrix(np.random.randint(0, 2, size=(10, 10)))
xy = sparse.hstack([x, y])
print(xy.shape)
# (10, 20)
print(type(xy))
# <class 'scipy.sparse.coo.coo_matrix'>
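Since the inputs in the question are csr_matrix objects, it may be useful to get CSR back instead of COO. The sketch below assumes that is the goal and uses the `format` argument of `scipy.sparse.hstack`, or alternatively a `.tocsr()` conversion. The matrix sizes are arbitrary examples. One possible cause of the "array of matrices" result, which is a guess rather than something the question confirms, is calling `np.hstack` instead of the scipy.sparse version.

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)
x = sparse.csr_matrix(rng.integers(0, 2, size=(10, 10)))
y = sparse.csr_matrix(rng.integers(0, 2, size=(10, 15)))  # different width, same row count

# Ask hstack for CSR output directly.
xy_csr = sparse.hstack([x, y], format="csr")

# Equivalent: stack first, then convert the result.
xy_converted = sparse.hstack([x, y]).tocsr()

print(xy_csr.shape)
print(xy_csr.format)
```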

How to sort each row of a 3D numpy array by another 2D array?

I have a 2D grid of 2D points, stored as a 3D numpy array:
np.random.seed(0)
a = np.random.rand(3, 4, 2) # each value is a 2D point
I would like to sort each row by the norm of every point:
norms = np.linalg.norm(a, axis=2) # shape(3, 4)
indices = np.argsort(norms, axis=0) # indices of each sorted row
Now I would like to create an array with the same shape and values as a, but with each row of 2D points sorted by their norm.
How can I achieve that?
I tried variations of np.take and np.take_along_axis, but with no success.
For example:
np.take(a, indices, axis=1) # shape (3,3,4,2)
This samples a three times, once for each row in indices.
I would like to sample a just once: each row in indices holds the columns that should be sampled from the corresponding row of a.
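If I understand you correctly, you want to sort along axis=1 (within each row) and give the indices a trailing axis so that they broadcast over the point coordinates: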
norms = np.linalg.norm(a,axis=2) # shape(3,4)
indices = np.argsort(norms , axis=1)
np.take_along_axis(a, indices[:,:,None], axis=1)
Output for your example:
[[[0.4236548  0.64589411]
  [0.60276338 0.54488318]
  [0.5488135  0.71518937]
  [0.43758721 0.891773  ]]
 [[0.07103606 0.0871293 ]
  [0.79172504 0.52889492]
  [0.96366276 0.38344152]
  [0.56804456 0.92559664]]
 [[0.0202184  0.83261985]
  [0.46147936 0.78052918]
  [0.77815675 0.87001215]
  [0.97861834 0.79915856]]]
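To check that each row really ends up ordered by norm, here is a small sketch that runs the same idea on freshly generated data (the `default_rng` seed and variable names are illustrative assumptions, so the values differ from the output above). It also shows an equivalent form using integer array indexing.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((3, 4, 2))  # 3 rows of 4 two-dimensional points

norms = np.linalg.norm(a, axis=2)    # shape (3, 4)
indices = np.argsort(norms, axis=1)  # sort order within each row

# indices gets a trailing axis so it broadcasts over the two coordinates.
sorted_a = np.take_along_axis(a, indices[:, :, None], axis=1)

# Equivalent: pair each row number with that row's column order.
rows = np.arange(a.shape[0])[:, None]  # shape (3, 1)
sorted_b = a[rows, indices]            # shape (3, 4, 2)

sorted_norms = np.linalg.norm(sorted_a, axis=2)
print(sorted_a.shape)
print(np.all(np.diff(sorted_norms, axis=1) >= 0))
```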

How to reshape an array of arrays in Python using Numpy

As you can see below I have created three arrays that contain different random numbers:
import numpy as np
import pandas as pd
np.random.seed(200)
Array1 = np.random.randn(300)
Array2 = Array1 + np.random.randn(300) * 2
Array3 = Array1 + np.random.randn(300) * 2
data = np.array([Array1, Array2 , Array3])
#data.reshape(data, (Array3, Array1)
mydf = pd.DataFrame(data)
mydf.tail()
My objective is to build a DataFrame from those three arrays. Each array should show its values in a different column, so the DataFrame should have three columns plus the index. My problem with the above code is that the DataFrame is built horizontally (3 rows by 300 columns) instead of vertically (300 rows by 3 columns). The DataFrame looks like this:
I have tried to use the reshape function to reshape the numpy array called "data", but I couldn't make it work. Any help would be more than welcome. Thanks!
You can use .T to transpose either the data, with data = np.array([Array1, Array2, Array3]).T, or the DataFrame, with mydf = pd.DataFrame(data).T.
Output:
0 1 2
295 -0.126758 1.697413 0.399351
296 0.548405 1.402154 -4.396156
297 -1.063243 0.279774 -0.636649
298 -0.678952 -2.061554 0.244339
299 -0.527970 -0.290680 -0.930381
Or build a 2D array right away:
arr = np.random.randn(300, 3)
arr[:, 1:] = arr[:, [0]] + arr[:, 1:] * 2  # columns 1 and 2 = column 0 plus scaled noise
mydf = pd.DataFrame(arr)
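Another way to get one array per column, sketched here with NumPy only, is `np.column_stack`, which places each 1D array in its own column. The lowercase variable names and the `default_rng` seed are illustrative, so the numbers will not match the output above. The resulting 300x3 array can be passed to pd.DataFrame as in the question.

```python
import numpy as np

rng = np.random.default_rng(200)
array1 = rng.standard_normal(300)
array2 = array1 + rng.standard_normal(300) * 2
array3 = array1 + rng.standard_normal(300) * 2

# Each 1D array becomes one column of the result.
stacked = np.column_stack((array1, array2, array3))

# Same result as stacking the arrays as rows and transposing.
transposed = np.array([array1, array2, array3]).T

print(stacked.shape)
print(np.array_equal(stacked, transposed))
```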

Concatenate columns while maintaining rows

I have a numpy array, and I would like to concatenate the columns of each row into a single value per row. Below is what I have tried so far.
import numpy as np
randoma = np.random.choice(list('ACTG'), (5, 21), replace=True)  # create a 5x21 random matrix with A, C, T, G
randoma=np.concatenate(randoma, axis=None)
The expected result is something like:
randoma = ['AAGCCGCACACAGACCCTGAG',
'AAGCTGCACGCAGACCCTGAG',
'AGGCTGCACGCAGACCCTGAG',
'AAGCTGCACGTGGACCCTGAG',
'AGGCTGCACGTGGACCCTGAG',
'AGGCTGCACGTGGACCCTGAG',
'AAGCTGCATGTGGACCCTGAG']
You can join the characters of each row with a list comprehension:
import numpy as np
randoma = np.random.choice(list('ACTG'), (5, 21), replace=True)  # create a 5x21 random matrix with A, C, T, G
new_list = [''.join(x) for x in randoma.tolist()]
new_list
['CGGGACGCACTTCCTGTGCAG',
'TGTAGCGGCTTGGTGTCCAAG',
'GAAAGTTTAGGATTGCGTCGG',
'AGTATTGTGATTCTATCTGAC',
'TTAGTAAGAGTGTCTCACTAT']
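For larger arrays, a possible alternative to joining in a Python loop is to reinterpret each row of single-character strings as one longer string with `ndarray.view`. This is a sketch under the assumption that the array holds fixed-width one-character Unicode strings and is C-contiguous. The names `letters`, `joined_list`, and `joined_view` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
letters = rng.choice(list("ACTG"), size=(5, 21))  # one-character strings

# List comprehension: iterating a 2D array yields its rows.
joined_list = ["".join(row) for row in letters]

# View trick: 21 one-character strings occupy the same bytes as
# one 21-character string, so each row can be reinterpreted in place.
n_cols = letters.shape[1]
joined_view = np.ascontiguousarray(letters).view(f"U{n_cols}").ravel()

print(joined_list[0])
print(joined_view.shape)
```

The list comprehension is easier to read and is usually fine for small inputs like this one.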
