Numpy dot product for group of rows - python

I am trying to calculate a dot product between two matrices, taking the rows of the second matrix two at a time.
I have a matrix D with dimensions (u x 2) and a matrix R with dimensions (2u x c).
Here is an example:
D = np.array([[0.02747092, 0.11233295],
[0.02747092, 0.07295284],
[0.01245856, 0.19935923],
[0.01245856, 0.13520913],
[0.11233295, 0.07295284]])
R = np.array([[-3. , 0. , 1. , -1. ],
[-1.25 , 0.75 , 1.75 , -1.25 ],
[-2.33333333, -0.33333333, 1.66666667, -1.33333333],
[-1.25 , 0.75 , 1.75 , -1.25 ],
[ 0. , -2. , 2. , -4. ],
[-1.25 , 0.75 , 1.75 , -1.25 ],
[ 0.66666667, -3.33333333, 2.66666667, -4.33333333],
[-1.25 , 0.75 , 1.75 , -1.25 ],
[-2.33333333, -0.33333333, 1.66666667, -1.33333333],
[-3. , 0. , 1. , -1. ]])
The result should be a matrix M with dimensions (u x c), as follows (only the first row is shown):
M = np.array([[-0.2185, 0.0825, 0.2195, -0.1645],
[...]])
This is the dot product of the first row of D with the first two rows of R (the values of D are rounded here):
D_ = np.array([[0.027, 0.11]])
R_ = np.array([[-3., 0., 1., -1.],
[-1.25, 0.75, 1.75, -1.25]])
D_.dot(R_)
I tried various forms of np.tensordot after reshaping the D matrix into a tensor, but without any luck. I am looking for a vectorized solution that avoids loops (my current solution uses loops and is quite slow).

Reshape R to 3D and use np.einsum:
np.einsum('ijk,ij->ik',R.reshape(len(D),2,-1),D)
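To see why the reshape plus einsum does what the question asks, it can help to compare it with the explicit loop. The sketch below uses random stand-in data (the sizes u and c and the names M_loop, M_einsum and M_matmul are illustrative assumptions, not part of the original post) and also shows an equivalent batched matrix product.

```python
import numpy as np

# Random stand-in data with the shapes from the question: D is (u, 2), R is (2u, c)
u, c = 5, 4
rng = np.random.default_rng(0)
D = rng.random((u, 2))
R = rng.random((2 * u, c))

# Reference: row i of D times rows 2i and 2i+1 of R
M_loop = np.array([D[i] @ R[2 * i:2 * i + 2] for i in range(u)])

# Vectorized: view R as u blocks of shape (2, c), then sum over the pair axis j
R3 = R.reshape(u, 2, -1)
M_einsum = np.einsum('ijk,ij->ik', R3, D)

# Equivalent batched matrix product: (u, 1, 2) @ (u, 2, c) -> (u, 1, c)
M_matmul = (D[:, None, :] @ R3)[:, 0, :]

print(np.allclose(M_loop, M_einsum), np.allclose(M_loop, M_matmul))
```

The reshape only works because each row of D corresponds to two consecutive rows of R.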

Related

how to make a regular grid base on some irregular points in python

I have a numpy array of x and y coordinates and want to make it regular. The array is sorted based on its x values (first column):
import numpy as np
Irregular_points = np.array([[1.1,5.], [0.85,7.1], [0.9,9], [1.1,11], [1.,13.1],
[1.9,5.2], [2.,6.9], [1.95,9], [2.1,11.1], [2.,13.1],
[3.0,5.1], [3.1,7.0], [3.,9], [3.0,11.], [3.1,12.8]])
First, I want to find out which points have almost the same x values: these are the first five rows, the middle five rows and the last five rows. One signal for finding these groups is that the y value decreases when I move on to the next group. Then I want to replace the x values of each group with their average. For example, in the first five rows the x values are 1.1, 0.85, 0.9, 1.1 and 1., and the average is 0.99. I want to do the same for the other two groups.
For the y values I again want to find similar ones, which fall into five groups, and replace them with the average of each group. The y values of the first group are 5., 5.2 and 5.1, and their average is 5.1. Finally, my points should look like the following array:
Regular_points = np.array([[0.99,5.1], [0.99,7.0], [0.99,9.0], [0.99,11.03], [0.99,13.0],
[1.99,5.1], [1.99,7.0], [1.99,9.0], [1.99,11.03], [1.99,13.0],
[3.04,5.1], [3.04,7.0], [3.04,9.0], [3.04,11.03], [3.04,13.0]])
I tried rounding the numbers, but that did not work for real cases, so I need to compute these averages. I would very much appreciate any help. The figure shows what I want: the red dots are the irregular points, and replacing them with the averages gives the blue dots.
Since you're averaging over rows and columns, you'll need a different shape. Reshape so that each group gets its own axis, separate the x and y coordinates, average them along different axes, and use np.transpose with np.meshgrid to assemble the result:
irregular_points = np.array([[1.1,5.], [0.85,7.1], [0.9,9], [1.1,11], [1.,13.1],
[1.9,5.2], [2.,6.9], [1.95,9], [2.1,11.1], [2.,13.1],
[3.0,5.1], [3.1,7.0], [3.,9], [3.0,11.], [3.1,12.8]])
points_reshape = irregular_points.reshape(3, 5, 2)
x, y = np.transpose(points_reshape)
x_mean = x.mean(axis=0)
y_mean = y.mean(axis=1)
regular_points = np.transpose(np.meshgrid(x_mean, y_mean))
regular_points
>>>
array([[[ 0.99 , 5.1 ],
[ 0.99 , 7. ],
[ 0.99 , 9. ],
[ 0.99 , 11.03333333],
[ 0.99 , 13. ]],
[[ 1.99 , 5.1 ],
[ 1.99 , 7. ],
[ 1.99 , 9. ],
[ 1.99 , 11.03333333],
[ 1.99 , 13. ]],
[[ 3.04 , 5.1 ],
[ 3.04 , 7. ],
[ 3.04 , 9. ],
[ 3.04 , 11.03333333],
[ 3.04 , 13. ]]])
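The reshape to (3, 5, 2) above hard-codes the number of groups. The question mentions that y decreases whenever a new group starts, so the group count could be detected from that signal. This is a minimal sketch under two assumptions: the points are ordered group by group as in the example, and every group has the same number of points. The names starts, n_groups and n_per_group are illustrative.

```python
import numpy as np

irregular_points = np.array([[1.1, 5.], [0.85, 7.1], [0.9, 9], [1.1, 11], [1., 13.1],
                             [1.9, 5.2], [2., 6.9], [1.95, 9], [2.1, 11.1], [2., 13.1],
                             [3.0, 5.1], [3.1, 7.0], [3., 9], [3.0, 11.], [3.1, 12.8]])

y = irregular_points[:, 1]
# A new group starts wherever y drops compared with the previous point
starts = np.flatnonzero(np.diff(y) < 0) + 1
n_groups = len(starts) + 1
n_per_group = len(irregular_points) // n_groups  # assumes equally sized groups

grid = irregular_points.reshape(n_groups, n_per_group, 2)
x_mean = grid[:, :, 0].mean(axis=1)  # one x average per group
y_mean = grid[:, :, 1].mean(axis=0)  # one y average per position within a group

print(n_groups, n_per_group)
print(x_mean)
print(y_mean)
```

If the groups can have different sizes, the reshape no longer applies and np.split(irregular_points, starts) would be a starting point instead.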
You could use a cluster algorithm like KMeans:
import numpy as np
from sklearn.cluster import KMeans
irregular_points = np.array([[1.1,5.], [0.85,7.1], [0.9,9], [1.1,11], [1.,13.1],
[1.9,5.2], [2.,6.9], [1.95,9], [2.1,11.1], [2.,13.1],
[3.0,5.1], [3.1,7.0], [3.,9], [3.0,11.], [3.1,12.8]])
kmeans_x = KMeans(n_clusters=3).fit(irregular_points[:, 0, np.newaxis])
kmeans_y = KMeans(n_clusters=5).fit(irregular_points[:, 1, np.newaxis])
clusters_x = kmeans_x.predict(irregular_points[:, 0, np.newaxis])
clusters_y = kmeans_y.predict(irregular_points[:, 1, np.newaxis])
regular_points_x = kmeans_x.cluster_centers_[clusters_x]
regular_points_y = kmeans_y.cluster_centers_[clusters_y]
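# Stack the x and y centers column-wise to get shape (15, 2), then group the
# points as 3 groups of 5 (the input is ordered group by group)
regular_points = np.hstack([regular_points_x, regular_points_y]).reshape(3, 5, 2)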
>>>
array([[[ 0.99 , 5.1 ],
[ 0.99 , 7. ],
[ 0.99 , 9. ],
[ 0.99 , 11.03333333],
[ 0.99 , 13. ]],
[[ 1.99 , 5.1 ],
[ 1.99 , 7. ],
[ 1.99 , 9. ],
[ 1.99 , 11.03333333],
[ 1.99 , 13. ]],
[[ 3.04 , 5.1 ],
[ 3.04 , 7. ],
[ 3.04 , 9. ],
[ 3.04 , 11.03333333],
[ 3.04 , 13. ]]])

Plot RGB Values with matplotlib

I have a set of RGB values in an array rgb_array of the form
[255.000, 56.026, 0.000]
[246.100, 60.000, 0.000]
...
>>> print(rgb_array.shape)
(1000, 3)
that I'd like to plot similarly to the color gradient shown above.
How can I best use matplotlib's imshow to achieve this?
Supposing your array has N rows where each row contains 3 floats between 0 and 255, you can create an image as follows. First convert it to a numpy array of integers, and reshape it to (1, N, 3). This will make it a 1xN image. Then, display the image using imshow. You need to set an extent to get the x and y axes as in your example, or just set them to [0, 1, 0, 1]. Also the aspect ratio needs to be controlled, as otherwise the pixels would be considered "square".
import numpy as np
import matplotlib.pyplot as plt
rgb_array = [[255.000, 56.026 + (255 - 56.026) * i / 400, 255 * i / 400] for i in range(400)]
rgb_array += [[255 - 255 * i / 600, 255 - 255 * i / 600, 255] for i in range(600)]
img = np.array(rgb_array, dtype=int).reshape((1, len(rgb_array), 3))
plt.imshow(img, extent=[0, 16000, 0, 1], aspect='auto')
plt.show()
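As an alternative to casting to integers, imshow also accepts float RGB data in the range [0, 1]. The sketch below only prepares such an array. The three-row rgb_array is a hypothetical stand-in for the real data, and the plotting call is left as a comment so the snippet runs without a display.

```python
import numpy as np

# Hypothetical stand-in for the (N, 3) array of floats in [0, 255]
rgb_array = np.array([[255.000, 56.026, 0.000],
                      [246.100, 60.000, 0.000],
                      [0.000, 0.000, 255.000]])

# Scale to [0, 1] and add a leading axis so the data becomes a 1 x N RGB image
img = np.clip(rgb_array / 255.0, 0.0, 1.0)[np.newaxis, :, :]
print(img.shape)

# img can then be displayed the same way as the integer image, for example:
#   plt.imshow(img, extent=[0, 1, 0, 1], aspect='auto')
```

Dividing by 255 keeps the fractional part of each channel, whereas the integer cast truncates it.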
Don't use this method - @JohanC provides a much superior solution that creates an image rather than a bar graph.
I'm not very experienced with Matplotlib, but I came up with this. There may be more efficient methods, so please correct me if this is the wrong approach.
#!/usr/bin/env python3
import numpy as np
import matplotlib.pyplot as plt
NSAMPLES = 100
# Synthesize R, G, B and A channels with dummy data
# The thing to note is that the samples are REAL and in range [0..1]
# (the np.float alias was removed in NumPy 1.24, so the builtin float is used)
r = np.linspace(0,1,NSAMPLES).astype(float)
g = 1.0 - r
b = np.full(NSAMPLES,0.5,float)
a = np.full(NSAMPLES,1,float)
# Merge into a single array, 4 deep
RGBA = np.dstack((r,g,b,a))
# Plot
height, width = 40, 1
plt.bar(np.arange(NSAMPLES), height, width, color=RGBA.reshape(-1,4))
plt.title("Some Funky Barplot")
plt.show()
The array RGBA looks like this:
array([[[0. , 1. , 0.5 , 1. ],
[0.01010101, 0.98989899, 0.5 , 1. ],
[0.02020202, 0.97979798, 0.5 , 1. ],
[0.03030303, 0.96969697, 0.5 , 1. ],
[0.04040404, 0.95959596, 0.5 , 1. ],
[0.05050505, 0.94949495, 0.5 , 1. ],
[0.06060606, 0.93939394, 0.5 , 1. ],
[0.07070707, 0.92929293, 0.5 , 1. ],
[0.08080808, 0.91919192, 0.5 , 1. ],
[0.09090909, 0.90909091, 0.5 , 1. ],
[0.1010101 , 0.8989899 , 0.5 , 1. ],
[0.11111111, 0.88888889, 0.5 , 1. ],
[0.12121212, 0.87878788, 0.5 , 1. ],
[0.13131313, 0.86868687, 0.5 , 1. ],
[0.14141414, 0.85858586, 0.5 , 1. ],
[0.15151515, 0.84848485, 0.5 , 1. ],
[0.16161616, 0.83838384, 0.5 , 1. ],
[0.17171717, 0.82828283, 0.5 , 1. ],
[0.18181818, 0.81818182, 0.5 , 1. ],
[0.19191919, 0.80808081, 0.5 , 1. ],
[0.2020202 , 0.7979798 , 0.5 , 1. ],
[0.21212121, 0.78787879, 0.5 , 1. ],
[0.22222222, 0.77777778, 0.5 , 1. ],
[0.23232323, 0.76767677, 0.5 , 1. ],
[0.24242424, 0.75757576, 0.5 , 1. ],
[0.25252525, 0.74747475, 0.5 , 1. ],
[0.26262626, 0.73737374, 0.5 , 1. ],
[0.27272727, 0.72727273, 0.5 , 1. ],
[0.28282828, 0.71717172, 0.5 , 1. ],
[0.29292929, 0.70707071, 0.5 , 1. ],
[0.3030303 , 0.6969697 , 0.5 , 1. ],
[0.31313131, 0.68686869, 0.5 , 1. ],
[0.32323232, 0.67676768, 0.5 , 1. ],
[0.33333333, 0.66666667, 0.5 , 1. ],
[0.34343434, 0.65656566, 0.5 , 1. ],
[0.35353535, 0.64646465, 0.5 , 1. ],
[0.36363636, 0.63636364, 0.5 , 1. ],
[0.37373737, 0.62626263, 0.5 , 1. ],
[0.38383838, 0.61616162, 0.5 , 1. ],
[0.39393939, 0.60606061, 0.5 , 1. ],
[0.4040404 , 0.5959596 , 0.5 , 1. ],
[0.41414141, 0.58585859, 0.5 , 1. ],
[0.42424242, 0.57575758, 0.5 , 1. ],
[0.43434343, 0.56565657, 0.5 , 1. ],
[0.44444444, 0.55555556, 0.5 , 1. ],
[0.45454545, 0.54545455, 0.5 , 1. ],
[0.46464646, 0.53535354, 0.5 , 1. ],
[0.47474747, 0.52525253, 0.5 , 1. ],
[0.48484848, 0.51515152, 0.5 , 1. ],
[0.49494949, 0.50505051, 0.5 , 1. ],
[0.50505051, 0.49494949, 0.5 , 1. ],
[0.51515152, 0.48484848, 0.5 , 1. ],
[0.52525253, 0.47474747, 0.5 , 1. ],
[0.53535354, 0.46464646, 0.5 , 1. ],
[0.54545455, 0.45454545, 0.5 , 1. ],
[0.55555556, 0.44444444, 0.5 , 1. ],
[0.56565657, 0.43434343, 0.5 , 1. ],
[0.57575758, 0.42424242, 0.5 , 1. ],
[0.58585859, 0.41414141, 0.5 , 1. ],
[0.5959596 , 0.4040404 , 0.5 , 1. ],
[0.60606061, 0.39393939, 0.5 , 1. ],
[0.61616162, 0.38383838, 0.5 , 1. ],
[0.62626263, 0.37373737, 0.5 , 1. ],
[0.63636364, 0.36363636, 0.5 , 1. ],
[0.64646465, 0.35353535, 0.5 , 1. ],
[0.65656566, 0.34343434, 0.5 , 1. ],
[0.66666667, 0.33333333, 0.5 , 1. ],
[0.67676768, 0.32323232, 0.5 , 1. ],
[0.68686869, 0.31313131, 0.5 , 1. ],
[0.6969697 , 0.3030303 , 0.5 , 1. ],
[0.70707071, 0.29292929, 0.5 , 1. ],
[0.71717172, 0.28282828, 0.5 , 1. ],
[0.72727273, 0.27272727, 0.5 , 1. ],
[0.73737374, 0.26262626, 0.5 , 1. ],
[0.74747475, 0.25252525, 0.5 , 1. ],
[0.75757576, 0.24242424, 0.5 , 1. ],
[0.76767677, 0.23232323, 0.5 , 1. ],
[0.77777778, 0.22222222, 0.5 , 1. ],
[0.78787879, 0.21212121, 0.5 , 1. ],
[0.7979798 , 0.2020202 , 0.5 , 1. ],
[0.80808081, 0.19191919, 0.5 , 1. ],
[0.81818182, 0.18181818, 0.5 , 1. ],
[0.82828283, 0.17171717, 0.5 , 1. ],
[0.83838384, 0.16161616, 0.5 , 1. ],
[0.84848485, 0.15151515, 0.5 , 1. ],
[0.85858586, 0.14141414, 0.5 , 1. ],
[0.86868687, 0.13131313, 0.5 , 1. ],
[0.87878788, 0.12121212, 0.5 , 1. ],
[0.88888889, 0.11111111, 0.5 , 1. ],
[0.8989899 , 0.1010101 , 0.5 , 1. ],
[0.90909091, 0.09090909, 0.5 , 1. ],
[0.91919192, 0.08080808, 0.5 , 1. ],
[0.92929293, 0.07070707, 0.5 , 1. ],
[0.93939394, 0.06060606, 0.5 , 1. ],
[0.94949495, 0.05050505, 0.5 , 1. ],
[0.95959596, 0.04040404, 0.5 , 1. ],
[0.96969697, 0.03030303, 0.5 , 1. ],
[0.97979798, 0.02020202, 0.5 , 1. ],
[0.98989899, 0.01010101, 0.5 , 1. ],
[1. , 0. , 0.5 , 1. ]]])

In-line column assignments in Python/Numpy

I have a bunch of points and need to select a subset of them, add a value to the x coordinates and store the information in the original points.
I need to do it without loops or intermediate assignments.
import numpy as np
points=np.array([[100. , 100. , 100. ],
[ 0. , -2.75, 0. ],
[ 0. , -2.75, 5. ],
[ 0. , -1.9 , 3.15],
[ 0. , -1.9 , 3.35]])
then trying:
points[[3,4,0]][:,[0]]+=2
or
points[[3,4,0]][:,[0]]=points[[3,4,0]][:,[0]]+2
the original points variable does not change.
Any ideas? I suspect I am missing some stupid stuff...
If you are looking to edit first column of those rows use:
points[[3,4,0], 0] += 2
points
#[[ 102. 100. 100. ]
# [ 0. -2.75 0. ]
# [ 0. -2.75 5. ]
# [ 2. -1.9 3.15]
# [ 2. -1.9 3.35]]
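The original attempt fails because indexing with a list (fancy indexing) returns a copy, so points[[3,4,0]][:,[0]] += 2 updates a temporary array that is immediately discarded. The sketch below demonstrates this and, as an extra not covered in the answer, shows np.add.at for the case where a row index is repeated. The names rows, before and unchanged are illustrative.

```python
import numpy as np

points = np.array([[100., 100., 100.],
                   [0., -2.75, 0.],
                   [0., -2.75, 5.],
                   [0., -1.9, 3.15],
                   [0., -1.9, 3.35]])
rows = [3, 4, 0]

# Chained indexing: points[rows] is a copy, so the += never reaches points
before = points.copy()
points[rows][:, [0]] += 2
unchanged = np.array_equal(points, before)
print(unchanged)

# A single indexing operation on the left-hand side writes into points itself
points[rows, 0] += 2
print(points[:, 0])

# With repeated indices, += applies only once per element; np.add.at accumulates
np.add.at(points, ([1, 1], 0), 2)
print(points[1, 0])
```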

How can I extract two separate matrices from a file?

So I have a file that looks something like this:
# 3 # Number of network ROIs
# 2 # Number of netcc matrices
# WITH_ROI_LABELS
001 002 003
1 2 3
# CC
1.0000 0.9800 0.9895
0.9800 1.0000 0.9817
0.9895 0.9817 1.0000
# FZ
4.0000 2.2965 2.6240
2.2965 4.0000 2.3426
2.6240 2.3426 4.0000
I want to extract the 3x3 matrix labelled "CC"
I want to extract the 3x3 matrix labelled "FZ"
So I did the following:
import numpy
file = "/users/3dfile1"
A = numpy.genfromtxt(file)
m = A[:,:]
m
So the output I get looks like this:
array([[ 1. , 2. , 3. ],
[ 1. , 2. , 3. ],
[ 1. , 0.98 , 0.9895],
[ 0.98 , 1. , 0.9817],
[ 0.9895, 0.9817, 1. ],
[ 4. , 2.2965, 2.624 ],
[ 2.2965, 4. , 2.3426],
[ 2.624 , 2.3426, 4. ]])
However, I have multiple files in which the matrix size is not consistent: in some files the matrices are 3x3, in others 8x8, 1x1, and so on. In that case, how can I write code that will:
differentiate the matrix CC from FZ
extract the matrix (detect its size somehow and give me exactly the matrix I'm looking for)
Try
import numpy as np
x = np.array([[ 1. , 2. , 3. ],
[ 1. , 2. , 3. ],
[ 1. , 0.98 , 0.9895],
[ 0.98 , 1. , 0.9817],
[ 0.9895, 0.9817, 1. ],
[ 4. , 2.2965, 2.624 ],
[ 2.2965, 4. , 2.3426],
[ 2.624 , 2.3426, 4. ]])
x1 = x[2:,:]
x2 = x1.reshape(2,3,3)
CC ,FZ = x2
Result:
In [23]: CC
Out[23]:
array([[ 1. , 0.98 , 0.9895],
[ 0.98 , 1. , 0.9817],
[ 0.9895, 0.9817, 1. ]])
In [24]: FZ
Out[24]:
array([[ 4. , 2.2965, 2.624 ],
[ 2.2965, 4. , 2.3426],
[ 2.624 , 2.3426, 4. ]])
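The reshape(2,3,3) above assumes 3x3 matrices. For files where the size varies, one option is to read the text line by line and collect the numeric rows that follow each "# CC" or "# FZ" header. This is a rough sketch that assumes the file layout shown in the question; the function name read_labelled_matrices and its wanted parameter are hypothetical.

```python
import numpy as np

def read_labelled_matrices(lines, wanted=("CC", "FZ")):
    """Collect the numeric rows that follow each '# LABEL' header line."""
    rows = {name: [] for name in wanted}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # A header switches the active matrix; unknown headers switch it off
            name = line.lstrip("#").strip()
            current = name if name in rows else None
        elif current is not None:
            rows[current].append([float(v) for v in line.split()])
    return {name: np.array(vals) for name, vals in rows.items()}

sample = """# 3 # Number of network ROIs
# 2 # Number of netcc matrices
# WITH_ROI_LABELS
001 002 003
1 2 3
# CC
1.0000 0.9800 0.9895
0.9800 1.0000 0.9817
0.9895 0.9817 1.0000
# FZ
4.0000 2.2965 2.6240
2.2965 4.0000 2.3426
2.6240 2.3426 4.0000
"""

mats = read_labelled_matrices(sample.splitlines())
print(mats["CC"].shape, mats["FZ"].shape)
```

For a real file, an open file object can be passed instead of sample.splitlines(), since iterating over it also yields lines.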

Fastest way to compute upper-triangular matrix of geometric series (Python)

Thanks in advance for the help.
Using Python (mostly numpy), I am trying to compute an upper-triangular matrix where each row holds the leading terms of a geometric series, shifted one column to the right per row, with all rows using the same parameter.
For example, if my parameter is B (where abs(B) <= 1, i.e. B in [-1,1]), then row 1 would be [1 B B^2 B^3 ... B^(N-1)], row 2 would be [0 1 B B^2...B^(N-2)] ... row N would be [0 0 0 ... 1].
This computation is key to a Bayesian Metropolis-Gibbs sampler, and so needs to be done thousands of times for new values of "B".
I have currently tried this two ways:
Method 1 - Mostly Vectorized:
B_Matrix = np.triu(np.dot(np.reshape(B**(-1*np.array(range(N))),(N,1)),np.reshape(B**(np.array(range(N))),(1,N))))
Essentially, this is the upper triangle part of a product of an Nx1 and 1xN set of matrices:
upper triangle ([1 B^(-1) B^(-2) ... B^(-(N-1))]' * [1 B B^2 B^3 ... B^(N-1)])
This works well for small N (algebraically it is correct), but for large N it fails with errors. It also fails for B=0, which should be allowed. I believe this stems from B^(-N) blowing up toward infinity for small B and large N.
Method 2:
B_Matrix = np.zeros((N,N))
B_Row_1 = B**(np.array(range(N)))
for n in range(N):
    B_Matrix[n,n:] = B_Row_1[0:N-n]
So that just fills in the matrix row by row, but uses a loop which slows things down.
I was wondering if anyone had run into this before, or had any better ideas on how to compute this matrix in a faster way.
I've never posted on stackoverflow before, but didn't see this question anywhere, and thought I'd ask.
Let me know if there's a better place to ask this, and if I should provide anymore detail.
You could use scipy.linalg.toeplitz (after from scipy.linalg import toeplitz):
In [12]: n = 5
In [13]: b = 0.5
In [14]: toeplitz(b**np.arange(n), np.zeros(n)).T
Out[14]:
array([[ 1. , 0.5 , 0.25 , 0.125 , 0.0625],
[ 0. , 1. , 0.5 , 0.25 , 0.125 ],
[ 0. , 0. , 1. , 0.5 , 0.25 ],
[ 0. , 0. , 0. , 1. , 0.5 ],
[ 0. , 0. , 0. , 0. , 1. ]])
If your use of the array is strictly "read only", you can play tricks with numpy strides to quickly create an array that uses only 2*n-1 elements (instead of n^2):
In [55]: from numpy.lib.stride_tricks import as_strided
In [56]: def make_array(b, n):
   ....:     vals = np.zeros(2*n - 1)
   ....:     vals[n-1:] = b**np.arange(n)
   ....:     a = as_strided(vals[n-1:], shape=(n, n), strides=(-vals.strides[0], vals.strides[0]))
   ....:     return a
   ....:
In [57]: make_array(0.5, 4)
Out[57]:
array([[ 1. , 0.5 , 0.25 , 0.125],
[ 0. , 1. , 0.5 , 0.25 ],
[ 0. , 0. , 1. , 0.5 ],
[ 0. , 0. , 0. , 1. ]])
If you will modify the array in-place, make a copy of the result returned by make_array(b, n). That is, arr = make_array(b, n).copy().
The function make_array2 incorporates the suggestion @Jaime made in the comments (np.cumprod is used here because the np.cumproduct alias was removed in NumPy 2.0):
In [30]: def make_array2(b, n):
   ....:     vals = np.zeros(2*n-1)
   ....:     vals[n-1] = 1
   ....:     vals[n:] = b
   ....:     np.cumprod(vals[n:], out=vals[n:])
   ....:     a = as_strided(vals[n-1:], shape=(n, n), strides=(-vals.strides[0], vals.strides[0]))
   ....:     return a
   ....:
In [31]: make_array2(0.5, 4)
Out[31]:
array([[ 1. , 0.5 , 0.25 , 0.125],
[ 0. , 1. , 0.5 , 0.25 ],
[ 0. , 0. , 1. , 0.5 ],
[ 0. , 0. , 0. , 1. ]])
make_array2 is more than twice as fast as make_array:
In [35]: %timeit make_array(0.99, 600)
10000 loops, best of 3: 23.4 µs per loop
In [36]: %timeit make_array2(0.99, 600)
100000 loops, best of 3: 10.7 µs per loop
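If a writable array is needed and the stride trick feels too fragile, plain broadcasting is another option. The sketch below (the name geometric_upper is hypothetical) builds the exponent j - i for every entry and uses only non-negative powers, so it avoids the B^(-N) overflow of Method 1 and also handles B = 0. It allocates the full n x n array, so it is probably slower than the strided versions above.

```python
import numpy as np

def geometric_upper(b, n):
    """Upper-triangular matrix with b**(j - i) at position (i, j) for j >= i."""
    idx = np.arange(n)
    expo = idx[None, :] - idx[:, None]            # expo[i, j] = j - i
    powers = np.power(float(b), np.clip(expo, 0, None))
    return np.where(expo >= 0, powers, 0.0)       # zero out the lower triangle

# Reference result built with the row-by-row loop from the question (Method 2)
N, B = 5, 0.5
expected = np.zeros((N, N))
first_row = B ** np.arange(N)
for n in range(N):
    expected[n, n:] = first_row[:N - n]

result = geometric_upper(B, N)
print(np.allclose(result, expected))
print(geometric_upper(0.0, 3))
```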
