Multiplication for a 3d array and slicing - python

I have a matrix of size 5 x 98 x 3. I want to find the transpose of each block of 98 x 3 and multiply it with itself to find the standard deviation.
Hence, I want my final answer to be of the size 5 x 3 x 3.
What would be an efficient way of doing this using numpy?
I can currently do this using the following code:
# MU.shape[0] == 5
rows = 98
SIGMA = []
for i in np.arange(MU.shape[0]):
    SIGMA.append([])
    SIGMA[i] = np.matmul(np.transpose(diff[i]), diff[i])
SIGMA = np.array(SIGMA)
SIGMA = SIGMA/rows
Here diff is of the size 5 x 98 x 3.

Use np.einsum to sum-reduce the second axis (the one of length 98) of the two operands against each other -
SIGMA = np.einsum('ijk,ijl->ikl',diff,diff)
SIGMA = SIGMA/rows
Pass optimize=True to np.einsum to let it leverage BLAS where possible.
We can also use np.matmul to get those sum-reductions -
SIGMA = np.matmul(diff.swapaxes(1,2),diff)
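As a quick sanity check, the sketch below compares the loop from the question with the two vectorized forms. The random `diff` array and the `sigma_*` names are assumptions made up for illustration; only the shapes come from the question.

```python
import numpy as np

# Hypothetical stand-in for the asker's data: 5 blocks of 98 x 3.
rng = np.random.default_rng(0)
diff = rng.standard_normal((5, 98, 3))
rows = diff.shape[1]

# Reference: per-block transpose-and-multiply, as in the question.
sigma_loop = np.array([d.T @ d for d in diff]) / rows

# Vectorized: sum over the shared axis j (length 98).
sigma_einsum = np.einsum('ijk,ijl->ikl', diff, diff) / rows

# Vectorized: batched matrix product of (5, 3, 98) with (5, 98, 3).
sigma_matmul = np.matmul(diff.swapaxes(1, 2), diff) / rows

print(sigma_einsum.shape)  # (5, 3, 3)
print(np.allclose(sigma_loop, sigma_einsum))
print(np.allclose(sigma_loop, sigma_matmul))
```

Both vectorized versions avoid the Python-level loop and the intermediate list.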

You can use this:
my_result = arr1.swapaxes(1,2) @ arr1
Testing it out:
import numpy as np
NINETY_EIGHT = 10
arr1 = np.arange(5*NINETY_EIGHT*3).reshape(5,NINETY_EIGHT,3)
my_result = arr1.swapaxes(1,2) @ arr1
print(my_result.shape)
Output:
(5, 3, 3)


How to efficiently apply function over each row of ndarray with value from list of args?

I would like to apply a function func over each row of a 2D ndarray arr shaped n x m, with a provided list of arguments args (of length n). That is, for each row i the function is executed as func(arr[i, :], args[i]).
This task can be accomplished with np.fromiter (using a for loop):
iterable = (func(row, arg) for row, arg in zip(arr, args))
results = np.fromiter(iterable, dtype=int)
However, this can take some time for large arrays. According to unutbu's answer, NumPy's Python utility functions (e.g. np.apply_along_axis) do not provide a significant speedup. Is there a way to optimize this process?
To avoid falling into the XY problem trap, here is my original problem statement:
I have an ndarray representing an image, shaped n x m. During processing, a specific index i is calculated for each row. I want to compose an image of the original shape (n x m) using the data to the right of index i in each row. That is, I want to resample each row[i:] of length m - i to m samples. Note that I want to use my own implementation of the resampling function (I don't want to use scipy.signal.resample etc.).
EDIT:
Test code with func example (added count argument to fromiter as suggested by LudvigH):
import numpy as np
import matplotlib.pyplot as plt
def simple_slant_range_correction(
    row, height, n_samples, max_ground_range, max_slant_range, slant_range_resolution
):
    ground_ranges = np.linspace(height, max_ground_range, n_samples)
    slant_ranges = np.sqrt(ground_ranges ** 2 + height ** 2)
    slant_ranges_indicies = slant_ranges / slant_range_resolution - 1
    slant_ranges_indicies_floor = np.floor(slant_ranges_indicies).astype(np.int16)
    # np.clip takes the array first, then the lower and upper bounds
    slant_ranges_indicies_ceil = np.clip(
        slant_ranges_indicies_floor + 1, 0, n_samples - 1
    )
    # Linear interpolation weight between the floor and ceil samples
    weight = slant_ranges_indicies - slant_ranges_indicies_floor
    return (
        weight * row[slant_ranges_indicies_ceil]
        + (1 - weight) * row[slant_ranges_indicies_floor]
    ).astype(np.float32)
if __name__ == "__main__":
    # Test parameters
    n, m = 100, 100
    max_slant_range = 50
    slant_range_resolution = max_slant_range / m
    # Create some dummy data
    data = np.zeros((n, m))
    h_indicies = np.ones((n), dtype=int)
    for i in np.arange(0, n, 5):
        data[:i, :i] += i
        h_indicies[:i] += 1
    heights = h_indicies * slant_range_resolution
    max_ground_ranges = np.sqrt(max_slant_range ** 2 - heights ** 2)
    # Perform resampling based on h_index
    iters = (
        simple_slant_range_correction(
            row, height, m, max_ground_range, max_slant_range, slant_range_resolution
        )
        for row, height, max_ground_range in zip(data, heights, max_ground_ranges)
    )
    data_sampled = np.fromiter(iters, dtype=np.dtype((np.float32, m)), count=n)
    # Plot data
    fig, axs = plt.subplots(1, 2)
    axs[0].plot(h_indicies + 0.5, np.arange(n) + 0.5, c="red")
    axs[0].imshow(data, vmin=0, vmax=data.max())
    axs[1].imshow(data_sampled, vmin=0, vmax=data.max())
    axs[0].set_axis_off()
    axs[1].set_axis_off()
    plt.tight_layout()
    plt.show()
It is typically faster to take advantage of vectorization by using numpy operations to manipulate the data, as compared to using python functions and objects to manipulate the data. Below is an example of a way to solve the problem described at the end of your question using numpy vectorization.
import numpy as np
Choosing some array and column indices as an example:
#     1 2 3 3                 1
# A = 4 5 6 6   row_indices = 3
#     7 8 9 9                 2
A = np.array([[1,2,3,3],[4,5,6,6],[7,8,9,9]])
row_indices = np.array([1,3,2])
Use vector operations to build a boolean masking array and then multiply the original array by the mask:
NM = np.shape(A)
N = NM[0]
M = NM[1]
col = np.arange(M,dtype=np.uint32)
B = np.outer(np.ones([1,N],dtype=np.uint32),col)
C = np.outer(row_indices,np.ones([1,M],dtype=np.uint32))
A_sampled = (B>=C)*A
print(A_sampled)
# output:
# 0 2 3 3
# 0 0 0 6
# 0 0 9 9
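The mask above zeroes out the samples left of each index but does not stretch the remaining samples back to the full row width. A minimal sketch of that resampling step, assuming plain linear interpolation is acceptable, could look like the following. The function name `resample_rows_from_index` and the choice of interpolation are assumptions for illustration, not the asker's own resampling routine.

```python
import numpy as np

def resample_rows_from_index(data, start_indices):
    """Linearly resample data[r, start_indices[r]:] to the full row width,
    for all rows at once (hypothetical helper, linear interpolation assumed)."""
    n, m = data.shape
    start = np.asarray(start_indices, dtype=np.float64)[:, None]  # shape (n, 1)
    # m evenly spaced fractional positions from start to m - 1 in every row.
    t = np.linspace(0.0, 1.0, m)[None, :]                         # shape (1, m)
    pos = start + t * (m - 1 - start)                             # shape (n, m)
    lo = np.floor(pos).astype(np.intp)       # left neighbour index
    hi = np.minimum(lo + 1, m - 1)           # right neighbour index, kept in bounds
    w = pos - lo                             # interpolation weight of the right neighbour
    left = np.take_along_axis(data, lo, axis=1)
    right = np.take_along_axis(data, hi, axis=1)
    return (1.0 - w) * left + w * right

A = np.array([[1, 2, 3, 3], [4, 5, 6, 6], [7, 8, 9, 9]], dtype=float)
row_indices = np.array([1, 3, 2])
resampled = resample_rows_from_index(A, row_indices)
print(resampled)
```

Each output row starts at the value found at its index and ends at the last sample of the original row; all gathering is done with `np.take_along_axis`, so there is no Python loop over rows.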

How to use Numpy to multiply all elements of matrix x by all elements of matrix y?

I am using Numpy as part of a neural network, and when updating the weights I am struggling to implement a step in a natural way.
The step works for an input rho_deltas (shape: (m,)) and self.node_layers[i-1].vals (shape: (n,)) and outputs self.previous_edge_layer[i - 1] (shape: (m,n)).
It should be such that self.previous_edge_layer[i - 1][j][k] == rho_deltas[j] * self.node_layers[i - 1].vals[k]
Example working inputs and outputs here.
(I'll try to update these so it is easier to copy and paste for testing your methods.)
I have managed to get it working well like:
self.previous_edge_layer[i - 1] = np.array([rho_delta * self.node_layers[i - 1].vals for rho_delta in rho_deltas])
However, it feels to me as though there is a Numpy operator/function that should be able to do this without iterating over the full list. My inclination is matrix multiplication (@); however, I have not been able to get this to work. Or perhaps the dot product (*); however, for n != m this fails.
Furthermore, I struggled to come up with a useful name for this question, so feel free to rename it to something better :).
Matrix multiplication is the right idea: the preliminary is to form matrices from your 1D vectors. We need 2D matrices here, even though one dimension will be of size 1. Something like this:
import numpy as np
rho_deltas = np.array([7.6, 12.3, 11.1]) # example data with m = 3
layer_vals = np.array([1.5, 20.9, -3.5, 7.0]) # example with n = 4
rho_deltas_col_mat = rho_deltas.reshape(-1, 1) # m rows, 1 column
layer_vals_row_mat = layer_vals.reshape(1, -1) # 1 row, n columns
res = rho_deltas_col_mat @ layer_vals_row_mat
print(res.shape)
print(all(res[j][k] == rho_deltas[j] * layer_vals[k] for j in range(rho_deltas.shape[0]) for k in range(layer_vals.shape[0])))
prints:
(3, 4)
True
Alternatively, you could reshape both of them to column matrices and transpose the second one, something like:
rho_deltas_col_mat = rho_deltas.reshape(-1, 1)
layer_vals_col_mat = layer_vals.reshape(-1, 1)
res = rho_deltas_col_mat @ layer_vals_col_mat.T
Based on the link you provided, you can use NumPy's meshgrid function to repeat each array along the other's dimension and then simply multiply them elementwise.
The following will do what you want (I tested it on your example and it produced the same results):
import numpy as np
a = np.array([1,2,3])
b = np.array([10,20,30,40,50])
bv, av = np.meshgrid(b,a) # repeats each array along the other one's dimension
# av = [[1 1 1 1 1]
#       [2 2 2 2 2]
#       [3 3 3 3 3]]
# bv = [[10 20 30 40 50]
#       [10 20 30 40 50]
#       [10 20 30 40 50]]
c = av*bv
# c = [[ 10  20  30  40  50]
#      [ 20  40  60  80 100]
#      [ 30  60  90 120 150]]
A similar result can also be achieved with NumPy's einsum function if you are familiar with Einstein summation notation.
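For reference, here is a short sketch of three other ways to build the same m x n table of products without materializing the meshgrid arrays: `np.outer`, plain broadcasting, and `np.einsum`. The variable names are made up for illustration; the input values reuse the example above.

```python
import numpy as np

a = np.array([1, 2, 3])               # plays the role of rho_deltas, m = 3
b = np.array([10, 20, 30, 40, 50])    # plays the role of the layer values, n = 5

# Outer product: result[j, k] == a[j] * b[k]
c_outer = np.outer(a, b)

# Broadcasting: (m, 1) * (1, n) -> (m, n)
c_broadcast = a[:, None] * b[None, :]

# Einstein summation with no repeated index, so nothing is summed
c_einsum = np.einsum('i,j->ij', a, b)

print(c_outer.shape)  # (3, 5)
print(np.array_equal(c_outer, c_broadcast), np.array_equal(c_outer, c_einsum))
```

All three produce the same array as `av*bv`; `np.outer` is probably the most direct spelling of the operation described in the question.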

repmat in python3 for a 100 by 3 matrix without dimension error. Matlab repmat worked fine but not python3

I have a 100 x 3 matrix called z. I want to repeat this matrix 15 times and then compute (x1-w1)**2 + (H-z1)**2 in Python 3.
See the example below. In MATLAB this worked fine with z1=repmat(z,n,1) where n=15. How do I fix the z1 repmat line in Python 3 so that (x1-w1)**2 + (H-z1)**2 does not raise a dimension error?
# some 100 by 3 matrix
z = np.random.rand(100, 3)
H = 10
x=np.transpose(np.linspace(0,100,15)).reshape(15,1)
w=np.linspace(0,100,100).reshape(100,1)
x1=np.matlib.repmat(x,1,100).T
w1= np.matlib.repmat(w,1,15)
z1=repmat(z,n,1) # where n=15
result = (x1-w1)**2 + (H-z1)**2
I guess you want your z1 to be equal to np.matlib.repmat(z, 1, n) instead of (z, n, 1).
Meaning, to match the shape of the other operand ((x1-w1)**2 of shape (100,15)), for z1 you should be repeating columns of z, not rows. Also, here I assume that your n is equal to 5, since this is the value that fits the example provided.
import numpy as np
import numpy.matlib
# some 100 by 3 matrix
z = np.random.rand(100, 3)
H = 10
x=np.transpose(np.linspace(0,100,15)).reshape(15,1)
w=np.linspace(0,100,100).reshape(100,1)
x1=np.matlib.repmat(x,1,100).T
w1= np.matlib.repmat(w,1,15)
z1=np.matlib.repmat(z,1,5)
print((x1-w1)**2 + (H-z1)**2)
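As an alternative sketch, the same (100, 15) result can be built without `numpy.matlib`: broadcasting replaces the two `repmat` calls for `x` and `w`, and `np.tile` replaces the one for `z`. The seeded random generator is an assumption added only to make the example reproducible, and the factor 5 follows the assumption made in the answer above.

```python
import numpy as np

rng = np.random.default_rng(0)    # seeded only for reproducibility (assumption)
z = rng.random((100, 3))          # some 100 by 3 matrix
H = 10
x = np.linspace(0, 100, 15)       # shape (15,)
w = np.linspace(0, 100, 100)      # shape (100,)

# Broadcasting (1, 15) against (100, 1) gives the (100, 15) grid of differences.
diff_sq = (x[None, :] - w[:, None]) ** 2

# Repeat the 3 columns of z five times side by side: (100, 3) -> (100, 15).
z1 = np.tile(z, (1, 5))

result = diff_sq + (H - z1) ** 2
print(result.shape)  # (100, 15)
```

Column j of `z1` is column `j % 3` of `z`, which matches what `repmat(z, 1, 5)` produces.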

Normalization of a vector using loops in Python

Write a function that normalizes a vector (finds the unit vector). A vector can be normalized by dividing each individual component of the vector by its magnitude. Your input for this function will be a vector i.e. 1 dimensional list containing 3 integers.
In my solution I assumed a predefined list of 3 elements. I would like to use loops instead, so please explain how I could arrive at a solution that way. This is my attempt so far:
from math import sqrt
def vector_normalization(my_vector):
    result = 0
    for x in my_vector:
        result = result + (x ** 2)
    magnitude = sqrt(result)
    nx_vector = my_vector[0] / magnitude
    ny_vector = my_vector[1] / magnitude
    nz_vector = my_vector[2] / magnitude
    n_vector = [nx_vector, ny_vector, nz_vector]
    return n_vector
After calculating the magnitude with a for loop, my program still returns only three elements, whatever the length of the input list. I want every element of an arbitrary list to be normalized. Please suggest a way to achieve this.
Also, you can use higher-order functions in Python like map:
from math import sqrt
vec = [1,2,3]
magnitude = sqrt(sum(map(lambda x: x**2, vec)))
normalized_vec = list(map(lambda x: x/magnitude, vec))
So normalized_vec becomes:
[0.2672612419124244, 0.5345224838248488, 0.8017837257372732]
Or using Numpy:
import numpy as np
arr = np.array([1,2,3])
arr_normalized = arr / np.sqrt(np.sum(arr**2))
arr_normalized results in:
array([ 0.26726124, 0.53452248, 0.80178373])
Please try the following code:
vector = [1,2,4]
y = 0
for x in vector:
    y += x**2
y = y**0.5
unit_vector = []
for x in vector:
    unit_vector.append(x/y)
Hope this helps.
def vector_normalization(vec):
    result = 0
    for x in vec:
        result = result + (x**2)
    magnitude = (result)**0.5
    x = vec[0]/magnitude
    y = vec[1]/magnitude
    z = vec[2]/magnitude
    vec = [x,y,z]
    return vec
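To handle a list of any length with loops only, the per-component division can go into a second loop. The sketch below is one way to do it; the function name `normalize` and the explicit zero-vector check are additions for illustration and are not part of the original exercise.

```python
from math import sqrt

def normalize(vector):
    """Return the unit vector of `vector` (a list of numbers of any length)."""
    # First loop: accumulate the sum of squares.
    total = 0.0
    for component in vector:
        total += component ** 2
    magnitude = sqrt(total)
    # Assumption: refuse to normalize the zero vector instead of dividing by zero.
    if magnitude == 0:
        raise ValueError("cannot normalize the zero vector")
    # Second loop: divide every component by the magnitude.
    unit = []
    for component in vector:
        unit.append(component / magnitude)
    return unit

print(normalize([3, 4]))         # [0.6, 0.8]
print(normalize([1, 2, 2, 4]))   # [0.2, 0.4, 0.4, 0.8]
```

Because nothing in the function refers to indices 0, 1 and 2, it works for vectors with two, three, or any other number of components.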

Python two arrays, get all points within radius

I have two arrays, let's say x and y, that contain a few thousand datapoints.
Plotting a scatterplot gives a beautiful representation of them. Now I'd like to select all points within a certain radius, for example r=10.
I tried this, but it does not work, as it's not a grid.
x = [1,2,4,5,7,8,....]
y = [-1,4,8,-1,11,17,....]
RAdeccircle = x**2+y**2
r = 10
regstars = np.where(RAdeccircle < r**2)
This is not the same as an nxn array, and RAdeccircle = x**2+y**2 does not seem to work as it does not try all permutations.
The ** operator works elementwise on NumPy arrays but not on plain Python lists: applying ** to a list raises a TypeError. You first need to convert the lists to NumPy arrays using np.array():
import numpy as np
x = np.array([1,2,4,5,7,8])
y = np.array([-1,4,8,-1,11,17])
RAdeccircle = x**2+y**2
print(RAdeccircle)
r = 10
regstars = np.where(RAdeccircle < r**2)
print(regstars)
>>> [ 2 20 80 26 170 353]
>>> (array([0, 1, 2, 3], dtype=int64),)
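If the circle is not centred on the origin, the same idea works after subtracting the centre. The following is a small sketch using a boolean mask instead of np.where; the helper name `points_within_radius` and the `cx`, `cy` parameters are assumptions introduced for illustration.

```python
import numpy as np

def points_within_radius(x, y, r, cx=0.0, cy=0.0):
    """Return the x and y coordinates of all points closer than r to (cx, cy)."""
    x = np.asarray(x)
    y = np.asarray(y)
    # Compare squared distances to avoid taking a square root per point.
    mask = (x - cx) ** 2 + (y - cy) ** 2 < r ** 2
    return x[mask], y[mask]

x = np.array([1, 2, 4, 5, 7, 8])
y = np.array([-1, 4, 8, -1, 11, 17])
x_in, y_in = points_within_radius(x, y, r=10)
print(x_in)  # [1 2 4 5]
print(y_in)  # [-1  4  8 -1]
```

Indexing with the mask returns the selected coordinates directly, which is convenient for plotting them in a different colour on the scatterplot.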
