Transpose Numpy Array (Vector) - python

a = np.array([0,1,2])
b = np.array([3,4,5,6,7])
...
c = np.dot(a,b)
I want to transpose b so I can calculate the dot product of a and b.

You can use numpy's broadcasting for this:
import numpy as np
a = np.array([0,1,2])
b = np.array([3,4,5,6,7])
In [3]: a[:,None]*b
Out[3]:
array([[ 0, 0, 0, 0, 0],
[ 3, 4, 5, 6, 7],
[ 6, 8, 10, 12, 14]])
This is not a dot product, though; it is the result you said you wanted in the comments.
You could also use the numpy function outer:
In [4]: np.outer(a, b)
Out[4]:
array([[ 0, 0, 0, 0, 0],
[ 3, 4, 5, 6, 7],
[ 6, 8, 10, 12, 14]])
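As a side note on why "transposing b" does not help in the original code: a 1-D array has only one axis, so .T leaves it unchanged. A minimal sketch, reusing a and b from the question (col and result are illustrative names, not from the original post):

```python
import numpy as np

a = np.array([0, 1, 2])
b = np.array([3, 4, 5, 6, 7])

# A 1-D array has a single axis, so .T returns an array of the same shape.
print(b.T.shape)  # (5,)

# np.dot on 1-D arrays of different lengths raises ValueError.
try:
    np.dot(a, b)
except ValueError as exc:
    print("dot failed:", exc)

# Adding an axis turns a into a (3, 1) column, which broadcasts against b.
col = a[:, None]
result = col * b
print(result.shape)  # (3, 5)
```

The explicit new axis is what makes the shapes compatible; the transpose alone cannot do it for a vector.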

What you want here is the outer product of the two arrays. The function to use for this is np.outer:
a = np.array([0,1,2])
b = np.array([3,4,5,6,7])
np.outer(a,b)
array([[ 0, 0, 0, 0, 0],
[ 3, 4, 5, 6, 7],
[ 6, 8, 10, 12, 14]])

With NumPy you could also reshape by swapping axes:
a = np.swapaxes([a], 1, 0)
# [[0]
# [1]
# [2]]
Then
print(a * b)
# [[ 0 0 0 0 0]
# [ 3 4 5 6 7]
# [ 6 8 10 12 14]]
Reshaping b instead requires transposing the product; see below.
Or use the usual NumPy reshape:
a = np.array([0,1,2])
b = np.array([3,4,5,6,7]).reshape(5,1)
print((a * b).T)
# [[ 0 0 0 0 0]
# [ 3 4 5 6 7]
# [ 6 8 10 12 14]]
The reshape is equivalent to b = np.array([ [bb] for bb in [3,4,5,6,7] ]), after which b becomes:
# [[3]
# [4]
# [5]
# [6]
# [7]]
When reshaping a instead, there is no need to transpose:
a = np.array([0,1,2]).reshape(3,1)
b = np.array([3,4,5,6,7])
print(a * b)
# [[ 0 0 0 0 0]
# [ 3 4 5 6 7]
# [ 6 8 10 12 14]]
Just out of curiosity, good old list comprehension:
a = [0,1,2]
b = [3,4,5,6,7]
print( [ [aa * bb for bb in b] for aa in a ] )
#=> [[0, 0, 0, 0, 0], [3, 4, 5, 6, 7], [6, 8, 10, 12, 14]]

Others have provided the outer and broadcasted solutions. Here's the dot one(s):
np.dot(a.reshape(3,1), b.reshape(1,5))
a[:,None].dot(b[None,:])
a[None].T.dot( b[None])
Conceptually I think it's a bit of overkill, but due to implementation details it is actually the fastest.
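To check that the variants in this thread really agree, here is a small sketch that compares each of them against np.outer. The dictionary and its keys are just illustrative; timings are not measured here and can be checked with timeit on your own data.

```python
import numpy as np

a = np.array([0, 1, 2])
b = np.array([3, 4, 5, 6, 7])

expected = np.outer(a, b)

# Each entry is one of the approaches suggested in the answers.
candidates = {
    "broadcast": a[:, None] * b,
    "dot_reshape": np.dot(a.reshape(3, 1), b.reshape(1, 5)),
    "dot_none": a[:, None].dot(b[None, :]),
    "dot_T": a[None].T.dot(b[None]),
}

all_equal = all(np.array_equal(v, expected) for v in candidates.values())
print(all_equal)  # True
```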

Related

Generate new matrix from A containing the average value of A rows for each column if B[i, j] == 1 where B is an adjacency matrix

How can we get a new matrix containing the average value of A row for each column if B[i,j] == 1 ?
Suppose we have a matrix A(3,4) and a matrix B(3,3)
A = [1 2 3 4
15 20 7 10
0 5 18 12]
And an adjacency matrix
B = [1 0 1
0 0 1
1 1 1 ]
Expected output matrix C which takes the average value of the connected pixels in B :
for example [(1+0)/2 (2+5)/2 (3+18)/2 (4+12)/2] so we get [0.5 , 3.5 10.5 8] in the first row.
C =[0.5 3.5 10.5 8
0 5 18 12
5.33 9 9.33 8.66]
To find the neighborhood of each i, I implemented the following code :
for i in range(A.shape[0]):
    for j in range(A.shape[0]):
        if (B[i,j] == 1):
            print(j)
You can form the sums you need by matrix multiplying:
>>> A = np.array([[1, 2, 3, 4], [15, 20, 7, 10], [0, 5, 18, 12]])
>>> B = np.array([[1, 0, 1], [0, 0, 1], [1, 1, 1]])
>>> summed_groups = B@A
>>> summed_groups
array([[ 1, 7, 21, 16],
[ 0, 5, 18, 12],
[16, 27, 28, 26]])
To get the means, normalize by the number of terms per group:
>>> group_sizes = B.sum(axis=1,keepdims=True)
>>> group_sizes
array([[2],
[1],
[3]])
>>> summed_groups / group_sizes
array([[ 0.5 , 3.5 , 10.5 , 8. ],
[ 0. , 5. , 18. , 12. ],
[ 5.33333333, 9. , 9.33333333, 8.66666667]])
Side note: you could also get the group sizes by matrix multiplication:
>>> group_sizes_alt = B@np.ones((len(A),1))
>>> group_sizes_alt
array([[2.],
[1.],
[3.]])
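The two steps above (sum by matrix product, then divide by group size) can be wrapped in one function. This is only a sketch: the name neighbor_means is made up, and returning NaN for a row of B with no connections is an assumption, since the question does not say what should happen in that case.

```python
import numpy as np

def neighbor_means(A, B):
    """Average the rows of A selected by each row of the 0/1 matrix B."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    sums = B @ A                            # per-group column sums
    counts = B.sum(axis=1, keepdims=True)   # number of rows in each group
    # Assumption: groups with no members yield NaN instead of dividing by zero.
    out = np.full_like(sums, np.nan)
    np.divide(sums, counts, out=out, where=counts > 0)
    return out

A = np.array([[1, 2, 3, 4], [15, 20, 7, 10], [0, 5, 18, 12]])
B = np.array([[1, 0, 1], [0, 0, 1], [1, 1, 1]])
C = neighbor_means(A, B)
print(np.round(C, 2))
```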
It is convenient to use boolean indexing. For example,
>>> A[[True, False, True], :]
array([[ 1, 2, 3, 4],
[ 0, 5, 18, 12]])
this selects rows 0 and 2 of the A matrix. You can loop over the columns of B and construct the C matrix:
A = np.array([[1, 2, 3, 4], [15, 20, 7, 10], [0, 5, 18, 12]])
B = np.array([[1, 0, 1], [0, 0, 1], [1, 1, 1]]).astype(bool)
C = np.array([A[B[:, i], :].mean(axis=0) for i in range(A.shape[0])])
print(np.around(C, 2))
Result:
[[ 0.5 3.5 10.5 8. ]
[ 0. 5. 18. 12. ]
[ 5.33 9. 9.33 8.67]]

numpy: convert multiple assignments to a single one using OR

taxi_modified is a two-dimensional ndarray.
Code below works, but seems un-pythonic:
taxi_modified[taxi_modified[:, 5] == 2, 15] = 1
taxi_modified[taxi_modified[:, 5] == 3, 15] = 1
taxi_modified[taxi_modified[:, 5] == 5, 15] = 1
Need to assign 1 to col at index 15 if col at index 5 is 2, 3, or 5.
The below didn't work:
taxi_modified[taxi_modified[:, 5] == 2 | 3 | 5, 15] = 1
You can use Boolean indexing with np.isin (NumPy v1.13+), or np.in1d for older versions.
Here's a demo:
# example input array
A = np.arange(16).reshape((4, 4))
# calculate Boolean mask for rows
mask = np.isin(A[:, 1], [1, 5, 13])
# assign values; np.where converts the mask to row indices
A[np.where(mask), 2] = -1
print(A)
array([[ 0, 1, -1, 3],
[ 4, 5, -1, 7],
[ 8, 9, 10, 11],
[12, 13, -1, 15]])
In one line, this can be written:
A[np.where(np.isin(A[:, 1], [1, 5, 13])), 2] = -1
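For completeness, the attempt in the question fails because | binds more tightly than ==, so 2 | 3 | 5 is evaluated first as a bitwise OR of integers. The sketch below shows that, plus two equivalent masks. The array taxi is a made-up stand-in with only two columns: column 0 plays the role of column 5 and column 1 the role of column 15.

```python
import numpy as np

# Python evaluates the bitwise ORs first, so the comparison is against 7.
print(2 | 3 | 5)  # 7

# Hypothetical stand-in for taxi_modified (two columns only).
taxi = np.array([[2, 0], [4, 0], [3, 0], [5, 0], [7, 0]])
col = taxi[:, 0]

# Option 1: combine explicit comparisons, each wrapped in parentheses.
mask_or = (col == 2) | (col == 3) | (col == 5)

# Option 2: np.isin builds the same Boolean mask.
mask_isin = np.isin(col, [2, 3, 5])

# A Boolean mask can index rows directly; np.where is not required.
taxi[mask_isin, 1] = 1
print(taxi[:, 1])  # [1 0 1 1 0]
```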

How to select value from array that is closest to value in array using vectorization?

I have an array of values that I want to replace with values from an array of choices, based on which choice is linearly closest.
The catch is the size of the choices is defined at runtime.
import numpy as np
a = np.array([[0, 0, 0], [4, 4, 4], [9, 9, 9]])
choices = np.array([1, 5, 10])
If choices was static in size, I would simply use np.where
d = np.where(np.abs(a - choices[0]) < np.abs(a - choices[1]),
             np.where(np.abs(a - choices[0]) < np.abs(a - choices[2]), choices[0], choices[2]),
             np.where(np.abs(a - choices[1]) < np.abs(a - choices[2]), choices[1], choices[2]))
To get the output:
>>d
>>[[1, 1, 1], [5, 5, 5], [10, 10, 10]]
Is there a way to do this more dynamically while still preserving the vectorization?
Subtract choices from a, find the index of the minimum of the result, substitute.
a = np.array([[0, 0, 0], [4, 4, 4], [9, 9, 9]])
choices = np.array([1, 5, 10])
b = a[:,:,None] - choices
np.absolute(b,b)
i = np.argmin(b, axis = -1)
a = choices[i]
print(a)
>>>
[[ 1 1 1]
[ 5 5 5]
[10 10 10]]
a = np.array([[0, 3, 0], [4, 8, 4], [9, 1, 9]])
choices = np.array([1, 5, 10])
b = a[:,:,None] - choices
np.absolute(b,b)
i = np.argmin(b, axis = -1)
a = choices[i]
print(a)
>>>
[[ 1 1 1]
[ 5 10 5]
[10 1 10]]
>>>
The extra dimension was added to a so that each element of choices would be subtracted from each element of a. choices was broadcast against a in the third dimension, so b.shape is (3,3,3). EricsBroadcastingDoc is a pretty good explanation and has a graphic 3-d example at the end.
For the second example:
>>> print(b)
[[[ 1 5 10]
[ 2 2 7]
[ 1 5 10]]
[[ 3 1 6]
[ 7 3 2]
[ 3 1 6]]
[[ 8 4 1]
[ 0 4 9]
[ 8 4 1]]]
>>> print(i)
[[0 0 0]
[1 2 1]
[2 0 2]]
>>>
The final assignment uses an Index Array or Integer Array Indexing.
In the second example, notice that there was a tie for element a[0,1]: either one or five could have been substituted.
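On the tie: np.argmin reports the first occurrence of the minimum, which is why 1 was chosen over 5 above. A small sketch of that behaviour, with one possible way to prefer the later choice instead (dist and last are illustrative names):

```python
import numpy as np

choices = np.array([1, 5, 10])

# 3 is equally far from 1 and 5; argmin reports the first minimum.
dist = np.abs(3 - choices)
print(dist)                      # [2 2 7]
print(choices[np.argmin(dist)])  # 1

# One way to prefer the later choice on ties: scan the reversed distances
# and map the index back to the original order.
last = len(choices) - 1 - np.argmin(dist[::-1])
print(choices[last])             # 5
```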
To explain wwii's excellent answer in a little more detail:
The idea is to create a new dimension which does the job of comparing each element of a to each element in choices using numpy broadcasting. This is easily done for an arbitrary number of dimensions in a using the ellipsis syntax:
>>> b = np.abs(a[..., np.newaxis] - choices)
array([[[ 1, 5, 10],
[ 1, 5, 10],
[ 1, 5, 10]],
[[ 3, 1, 6],
[ 3, 1, 6],
[ 3, 1, 6]],
[[ 8, 4, 1],
[ 8, 4, 1],
[ 8, 4, 1]]])
Taking argmin along the axis you just created (the last axis, with label -1) gives you the desired index in choices that you want to substitute:
>>> np.argmin(b, axis=-1)
array([[0, 0, 0],
[1, 1, 1],
[2, 2, 2]])
Which finally allows you to choose those elements from choices:
>>> d = choices[np.argmin(b, axis=-1)]
>>> d
array([[ 1, 1, 1],
[ 5, 5, 5],
[10, 10, 10]])
For a non-symmetric shape:
Let's say a had shape (2, 5):
>>> a = np.arange(10).reshape((2, 5))
>>> a
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
Then you'd get:
>>> b = np.abs(a[..., np.newaxis] - choices)
>>> b
array([[[ 1, 5, 10],
[ 0, 4, 9],
[ 1, 3, 8],
[ 2, 2, 7],
[ 3, 1, 6]],
[[ 4, 0, 5],
[ 5, 1, 4],
[ 6, 2, 3],
[ 7, 3, 2],
[ 8, 4, 1]]])
This is hard to read, but what it's saying is, b has shape:
>>> b.shape
(2, 5, 3)
The first two dimensions came from the shape of a, which is also (2, 5). The last dimension is the one you just created. To get a better idea:
>>> b[:, :, 0] # = abs(a - 1)
array([[1, 0, 1, 2, 3],
[4, 5, 6, 7, 8]])
>>> b[:, :, 1] # = abs(a - 5)
array([[5, 4, 3, 2, 1],
[0, 1, 2, 3, 4]])
>>> b[:, :, 2] # = abs(a - 10)
array([[10, 9, 8, 7, 6],
[ 5, 4, 3, 2, 1]])
Note how b[:, :, i] is the absolute difference between a and choices[i], for each i = 0, 1, 2.
Hope that helps explain this a little more clearly.
I love broadcasting and would have gone that way myself too. But, with large arrays, I would like to suggest another approach with np.searchsorted that keeps it memory efficient and thus achieves performance benefits, like so -
def searchsorted_app(a, choices):
    lidx = np.searchsorted(choices, a, 'left').clip(max=choices.size-1)
    ridx = (np.searchsorted(choices, a, 'right')-1).clip(min=0)
    cl = np.take(choices,lidx) # Or choices[lidx]
    cr = np.take(choices,ridx) # Or choices[ridx]
    mask = np.abs(a - cl) > np.abs(a - cr)
    cl[mask] = cr[mask]
    return cl
Please note that if the elements in choices are not sorted, we need to add in the additional argument sorter with np.searchsorted.
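If choices is not sorted, a simple alternative to the sorter argument is to sort a copy first and reuse the function unchanged. That is what this sketch does; it does not use sorter. The wrapper name nearest_unsorted is hypothetical. Note that on exact ties this approach may pick a different (equally close) choice than the argmin-based one.

```python
import numpy as np

def searchsorted_app(a, choices):
    # choices must be sorted in ascending order
    lidx = np.searchsorted(choices, a, 'left').clip(max=choices.size - 1)
    ridx = (np.searchsorted(choices, a, 'right') - 1).clip(min=0)
    cl = np.take(choices, lidx)   # nearest choice at or above each element
    cr = np.take(choices, ridx)   # nearest choice at or below each element
    mask = np.abs(a - cl) > np.abs(a - cr)
    cl[mask] = cr[mask]
    return cl

def nearest_unsorted(a, choices):
    # Hypothetical wrapper: sort a copy of choices, then reuse the function.
    return searchsorted_app(a, np.sort(choices))

a = np.array([[0, 3, 0], [4, 8, 4], [9, 1, 9]])
choices = np.array([10, 1, 5])  # deliberately unsorted
result = nearest_unsorted(a, choices)
print(result)
```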
Runtime test -
In [160]: # Setup inputs
...: a = np.random.rand(100,100)
...: choices = np.sort(np.random.rand(100))
...:
In [161]: def broadcasting_app(a, choices): # @wwii's solution
...:     return choices[np.argmin(np.abs(a[:,:,None] - choices),-1)]
...:
In [162]: np.allclose(broadcasting_app(a,choices),searchsorted_app(a,choices))
Out[162]: True
In [163]: %timeit broadcasting_app(a, choices)
100 loops, best of 3: 9.3 ms per loop
In [164]: %timeit searchsorted_app(a, choices)
1000 loops, best of 3: 1.78 ms per loop
Related post : Find elements of array one nearest to elements of array two

Python NumPy: Performing different column operations over every N rows

I have a large NumPy array (OriginalArray) with many rows and 8 columns.
I want to create a new array (NewArray) in which each row has the following properties:
Columns 1, 3, 5, and 7 of NewArray are the sum over N rows of columns 1, 3, 5, and 7 of OriginalArray
Columns 2, 4, 6, and 8 of NewArray are the mean over N rows of columns 2, 4, 6, and 8 of OriginalArray
So, the NewArray has 1/N as many rows as the OriginalArray.
For example:
Original Array = [1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 ]
with N = 2
NewArray = [2 1 2 1 2 1 2 1
2 1 2 1 2 1 2 1]
Please excuse the messy formatting. I'm still very new at this (my first question here, actually).
Thanks!
Here's a vectorized approach making heavy usage of slicing -
nrows = a.shape[0]//N # a is input array
out = np.empty((nrows,8))
out[:,::2] = a[:,::2].reshape(-1,N,4).sum(1)
out[:,1::2] = a[:,1::2].reshape(-1,N,4).mean(1)
Sample run -
In [64]: a # Input array
Out[64]:
array([[5, 1, 5, 8, 5, 0, 3, 1],
[0, 7, 8, 7, 0, 3, 5, 1],
[8, 6, 6, 4, 1, 6, 1, 2],
[4, 5, 5, 7, 5, 2, 1, 2]])
In [65]: N = 2 # Summing/averaging length
In [66]: a[:,::2] # Select [1,3,5,7] cols
Out[66]:
array([[5, 5, 5, 3],
[0, 8, 0, 5],
[8, 6, 1, 1],
[4, 5, 5, 1]])
In [67]: a[:,::2].reshape(-1,N,4).sum(1) # Sum N rows by splitting axis
Out[67]:
array([[ 5, 13, 5, 8],
[12, 11, 6, 2]])
In [68]: a[:,1::2] # Select [2,4,6,8] cols
Out[68]:
array([[1, 8, 0, 1],
[7, 7, 3, 1],
[6, 4, 6, 2],
[5, 7, 2, 2]])
In [69]: a[:,1::2].reshape(-1,N,4).mean(1) # Similarly average across N rows
Out[69]:
array([[ 4. , 7.5, 1.5, 1. ],
[ 5.5, 5.5, 4. , 2. ]])
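The same idea can be packaged as a function. This sketch is a small variation on the code above: it reshapes the whole array once into blocks of N rows and then slices the columns, instead of slicing first. The name block_sum_mean and the divisibility check are additions for illustration.

```python
import numpy as np

def block_sum_mean(a, N):
    """Sum even-indexed columns and average odd-indexed columns over blocks of N rows."""
    a = np.asarray(a)
    nrows, ncols = a.shape
    if nrows % N:
        raise ValueError("number of rows must be a multiple of N")
    # Shape (number of blocks, N, ncols): axis 1 runs over the rows of a block.
    blocks = a.reshape(nrows // N, N, ncols)
    out = np.empty((nrows // N, ncols))
    out[:, ::2] = blocks[:, :, ::2].sum(axis=1)     # columns 1, 3, 5, 7 (1-based)
    out[:, 1::2] = blocks[:, :, 1::2].mean(axis=1)  # columns 2, 4, 6, 8 (1-based)
    return out

a = np.array([[5, 1, 5, 8, 5, 0, 3, 1],
              [0, 7, 8, 7, 0, 3, 5, 1],
              [8, 6, 6, 4, 1, 6, 1, 2],
              [4, 5, 5, 7, 5, 2, 1, 2]])
out = block_sum_mean(a, 2)
print(out)
```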
I'm assuming that your original_array (note the PEP8 style) is already formatted in rows and columns. By this I mean, original_array = np.array([[1,1...],[1,...],[1,...],[1,...]])
An easy one-liner to create a single row of new_array would be as follows:
import numpy as np
row = [np.sum(original_array[:,x]) if x%2==0 else np.mean(original_array[:,x]) for x in range(len(original_array[0]))]
And then to copy the row, simply:
new_array = [row]*N

An array of matrices, built using values from an array in Python

Here is my code. What I want it to return is an array of matrices
[[1,1],[1,1]], [[2,4],[8,16]], [[3,9],[27,81]]
I know I can probably do it using for loop and looping through my vector k, but I was wondering if there is a simple way that I am missing. Thanks!
import numpy as np
k=np.arange(1,4,1)
print(k)
def exam(p):
    return np.array([[p,p**2],[p**3,p**4]])
print(exam(k))
The output:
[1 2 3]
[[[ 1 2 3]
[ 1 4 9]]
[[ 1 8 27]
[ 1 16 81]]]
The key is to play with the shapes and broadcasting.
b = np.arange(1,4) # the base
e = np.arange(1,5) # the exponent
b[:,np.newaxis] ** e
=>
array([[ 1, 1, 1, 1],
[ 2, 4, 8, 16],
[ 3, 9, 27, 81]])
(b[:,None] ** e).reshape(-1,2,2)
=>
array([[[ 1, 1],
[ 1, 1]],
[[ 2, 4],
[ 8, 16]],
[[ 3, 9],
[27, 81]]])
If you must have the output as a list of matrices, do:
m = (b[:,None] ** e).reshape(-1,2,2)
[ np.asmatrix(a) for a in m ]
=>
[matrix([[1, 1],
[1, 1]]),
matrix([[ 2, 4],
[ 8, 16]]),
matrix([[ 3, 9],
[27, 81]])]
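As a quick check, the broadcasting result can be compared with the explicit loop over k that the question wanted to avoid. The sketch also shows another option that keeps exam as written and just moves its last axis to the front with np.moveaxis; the names looped, broadcast and moved are illustrative.

```python
import numpy as np

def exam(p):
    return np.array([[p, p**2], [p**3, p**4]])

k = np.arange(1, 4)

# Reference: one 2x2 matrix per element of k, built with a plain loop.
looped = np.array([exam(p) for p in k])

# Broadcasting: bases down the rows, exponents 1..4 across the columns.
broadcast = (k[:, None] ** np.arange(1, 5)).reshape(-1, 2, 2)

# Alternative: exam(k) has shape (2, 2, 3); move the k axis to the front.
moved = np.moveaxis(exam(k), -1, 0)

print(np.array_equal(looped, broadcast))  # True
print(np.array_equal(looped, moved))      # True
```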
