I'm currently trying to append multiple Numpy arrays together. Basically, what I want to do is to start from a (1 x m) matrix (technically a vector), and end up with a (n x m) matrix. So going from n (1 x m) matrices (vectors) to one (n x m) matrix (If that makes any sense). The ultimate goal with this is to write the matrix into a csv-file with the numpy.savetxt() function so I'll end up with a csv-file with n columns of m length.
The problem with this is that numpy.append() appends the vectors together into a (1 x 2m) vector. So let's say a1 and a2 are Numpy arrays with 10000 elements each. I'll append a2 into a1 by using the append function and simultaneously creating a new array called a, which contains both a1 and a2.
a=np.append(a1, a2, axis=0)
a.shape
>>(20000,)
What I want instead is for the shape to be of the form
>>(2, 10000)
or more generally
>>(n, m)
What should I do? Please note, that I want to continue adding the vectors into the array. Thanks for your time!
you can use the transpose of numpy.column_stack
For example:
import numpy as np
a=np.array([1,2,3,4,5])
b=np.array([9,8,7,6,5])
c=np.column_stack((a,b)).T
print c
>>> array([[1, 2, 3, 4, 5],
[9, 8, 7, 6, 5]])
print a.shape,b.shape,c.shape
>>> (5,) (5,) (2, 5)
EDIT:
you can keep adding columns like so:
d=np.array([2,2,2,2,2])
c=np.column_stack((c.T,d)).T
print c
>>> array([[1, 2, 3, 4, 5],
[9, 8, 7, 6, 5],
[2, 2, 2, 2, 2]])
print c.shape
>>> (3, 5)
This should work
a=np.append(a1, a2, axis=0).reshape(2,10000)
a.shape
>>(2,10000)
In order to merge arrays vertically I would use np.vstack
import numpy as np
np.vstack((a1,a2))
However, from my point of view, numpy.array shouldn't be created using for loops and appending the new array to the old one. Instead, either you create first the whole numpy.array (nxm) and you write the data from the for loop into that array,
data = np.zeros((n,m))
for i in range(n):
data[i] = ...
or you first create your array as an ordinary python list using append which you can transform at the end into an numpy.array.
data = []
for i in range(n):
data.append(...)
data = np.asarray(data)
Related
I use Python and NumPy and have some problems with "transpose":
import numpy as np
a = np.array([5,4])
print(a)
print(a.T)
Invoking a.T is not transposing the array. If a is for example [[],[]] then it transposes correctly, but I need the transpose of [...,...,...].
It's working exactly as it's supposed to. The transpose of a 1D array is still a 1D array! (If you're used to matlab, it fundamentally doesn't have a concept of a 1D array. Matlab's "1D" arrays are 2D.)
If you want to turn your 1D vector into a 2D array and then transpose it, just slice it with np.newaxis (or None, they're the same, newaxis is just more readable).
import numpy as np
a = np.array([5,4])[np.newaxis]
print(a)
print(a.T)
Generally speaking though, you don't ever need to worry about this. Adding the extra dimension is usually not what you want, if you're just doing it out of habit. Numpy will automatically broadcast a 1D array when doing various calculations. There's usually no need to distinguish between a row vector and a column vector (neither of which are vectors. They're both 2D!) when you just want a vector.
Use two bracket pairs instead of one. This creates a 2D array, which can be transposed, unlike the 1D array you create if you use one bracket pair.
import numpy as np
a = np.array([[5, 4]])
a.T
More thorough example:
>>> a = [3,6,9]
>>> b = np.array(a)
>>> b.T
array([3, 6, 9]) #Here it didn't transpose because 'a' is 1 dimensional
>>> b = np.array([a])
>>> b.T
array([[3], #Here it did transpose because a is 2 dimensional
[6],
[9]])
Use numpy's shape method to see what is going on here:
>>> b = np.array([10,20,30])
>>> b.shape
(3,)
>>> b = np.array([[10,20,30]])
>>> b.shape
(1, 3)
For 1D arrays:
a = np.array([1, 2, 3, 4])
a = a.reshape((-1, 1)) # <--- THIS IS IT
print a
array([[1],
[2],
[3],
[4]])
Once you understand that -1 here means "as many rows as needed", I find this to be the most readable way of "transposing" an array. If your array is of higher dimensionality simply use a.T.
You can convert an existing vector into a matrix by wrapping it in an extra set of square brackets...
from numpy import *
v=array([5,4]) ## create a numpy vector
array([v]).T ## transpose a vector into a matrix
numpy also has a matrix class (see array vs. matrix)...
matrix(v).T ## transpose a vector into a matrix
numpy 1D array --> column/row matrix:
>>> a=np.array([1,2,4])
>>> a[:, None] # col
array([[1],
[2],
[4]])
>>> a[None, :] # row, or faster `a[None]`
array([[1, 2, 4]])
And as #joe-kington said, you can replace None with np.newaxis for readability.
To 'transpose' a 1d array to a 2d column, you can use numpy.vstack:
>>> numpy.vstack(numpy.array([1,2,3]))
array([[1],
[2],
[3]])
It also works for vanilla lists:
>>> numpy.vstack([1,2,3])
array([[1],
[2],
[3]])
instead use arr[:,None] to create column vector
You can only transpose a 2D array. You can use numpy.matrix to create a 2D array. This is three years late, but I am just adding to the possible set of solutions:
import numpy as np
m = np.matrix([2, 3])
m.T
Basically what the transpose function does is to swap the shape and strides of the array:
>>> a = np.ones((1,2,3))
>>> a.shape
(1, 2, 3)
>>> a.T.shape
(3, 2, 1)
>>> a.strides
(48, 24, 8)
>>> a.T.strides
(8, 24, 48)
In case of 1D numpy array (rank-1 array) the shape and strides are 1-element tuples and cannot be swapped, and the transpose of such an 1D array returns it unchanged. Instead, you can transpose a "row-vector" (numpy array of shape (1, n)) into a "column-vector" (numpy array of shape (n, 1)). To achieve this you have to first convert your 1D numpy array into row-vector and then swap the shape and strides (transpose it). Below is a function that does it:
from numpy.lib.stride_tricks import as_strided
def transpose(a):
a = np.atleast_2d(a)
return as_strided(a, shape=a.shape[::-1], strides=a.strides[::-1])
Example:
>>> a = np.arange(3)
>>> a
array([0, 1, 2])
>>> transpose(a)
array([[0],
[1],
[2]])
>>> a = np.arange(1, 7).reshape(2,3)
>>> a
array([[1, 2, 3],
[4, 5, 6]])
>>> transpose(a)
array([[1, 4],
[2, 5],
[3, 6]])
Of course you don't have to do it this way since you have a 1D array and you can directly reshape it into (n, 1) array by a.reshape((-1, 1)) or a[:, None]. I just wanted to demonstrate how transposing an array works.
Another solution.... :-)
import numpy as np
a = [1,2,4]
[1, 2, 4]
b = np.array([a]).T
array([[1],
[2],
[4]])
The name of the function in numpy is column_stack.
>>>a=np.array([5,4])
>>>np.column_stack(a)
array([[5, 4]])
I am just consolidating the above post, hope it will help others to save some time:
The below array has (2, )dimension, it's a 1-D array,
b_new = np.array([2j, 3j])
There are two ways to transpose a 1-D array:
slice it with "np.newaxis" or none.!
print(b_new[np.newaxis].T.shape)
print(b_new[None].T.shape)
other way of writing, the above without T operation.!
print(b_new[:, np.newaxis].shape)
print(b_new[:, None].shape)
Wrapping [ ] or using np.matrix, means adding a new dimension.!
print(np.array([b_new]).T.shape)
print(np.matrix(b_new).T.shape)
There is a method not described in the answers but described in the documentation for the numpy.ndarray.transpose method:
For a 1-D array this has no effect, as a transposed vector is simply the same vector. To convert a 1-D array into a 2D column vector, an additional dimension must be added. np.atleast2d(a).T achieves this, as does a[:, np.newaxis].
One can do:
import numpy as np
a = np.array([5,4])
print(a)
print(np.atleast_2d(a).T)
Which (imo) is nicer than using newaxis.
As some of the comments above mentioned, the transpose of 1D arrays are 1D arrays, so one way to transpose a 1D array would be to convert the array to a matrix like so:
np.transpose(a.reshape(len(a), 1))
To transpose a 1-D array (flat array) as you have in your example, you can use the np.expand_dims() function:
>>> a = np.expand_dims(np.array([5, 4]), axis=1)
array([[5],
[4]])
np.expand_dims() will add a dimension to the chosen axis. In this case, we use axis=1, which adds a column dimension, effectively transposing your original flat array.
Let's suppose I have two arrays that represent pixels in pictures.
I want to build an array of tensordot products of pixels of a smaller picture with a bigger picture as it "scans" the latter. By "scanning" I mean iteration over rows and columns while creating overlays with the original picture.
For instance, a 2x2 picture can be overlaid on top of 3x3 in four different ways, so I want to produce a four-element array that contains tensordot products of matching pixels.
Tensordot is calculated by multiplying a[i,j] with b[i,j] element-wise and summing the terms.
Please examine this code:
import numpy as np
a = np.array([[0,1,2],
[3,4,5],
[6,7,8]])
b = np.array([[0,1],
[2,3]])
shape_diff = (a.shape[0] - b.shape[0] + 1,
a.shape[1] - b.shape[1] + 1)
def compute_pixel(x,y):
sub_matrix = a[x : x + b.shape[0],
y : y + b.shape[1]]
return np.tensordot(sub_matrix, b, axes=2)
def process():
arr = np.zeros(shape_diff)
for i in range(shape_diff[0]):
for j in range(shape_diff[1]):
arr[i,j]=compute_pixel(i,j)
return arr
print(process())
Computing a single pixel is very easy, all I need is the starting location coordinates within a. From there I match the size of the b and do a tensordot product.
However, because I need to do this all over again for each x and y location as I'm iterating over rows and columns I've had to use a loop, which is of course suboptimal.
In the next piece of code I have tried to utilize a handy feature of tensordot, which also accepts tensors as arguments. In order words I can feed an array of arrays for different combinations of a, while keeping the b the same.
Although in order to create an array of said combination, I couldn't think of anything better than using another loop, which kind of sounds silly in this case.
def try_vector():
tensor = np.zeros(shape_diff + b.shape)
for i in range(shape_diff[0]):
for j in range(shape_diff[1]):
tensor[i,j]=a[i: i + b.shape[0],
j: j + b.shape[1]]
return np.tensordot(tensor, b, axes=2)
print(try_vector())
Note: tensor size is the sum of two tuples, which in this case gives (2, 2, 2, 2)
Yet regardless, even if I produced such array, it would be prohibitively large in size to be of any practical use. For doing this for a 1000x1000 picture, could probably consume all the available memory.
So, is there any other ways to avoid loops in this problem?
In [111]: process()
Out[111]:
array([[19., 25.],
[37., 43.]])
tensordot with 2 is the same as element multiply and sum:
In [116]: np.tensordot(a[0:2,0:2],b, axes=2)
Out[116]: array(19)
In [126]: (a[0:2,0:2]*b).sum()
Out[126]: 19
A lower-memory way of generating your tensor is:
In [121]: np.lib.stride_tricks.sliding_window_view(a,(2,2))
Out[121]:
array([[[[0, 1],
[3, 4]],
[[1, 2],
[4, 5]]],
[[[3, 4],
[6, 7]],
[[4, 5],
[7, 8]]]])
We can do a broadcasted multiply, and sum on the last 2 axes:
In [129]: (Out[121]*b).sum((2,3))
Out[129]:
array([[19, 25],
[37, 43]])
I've been working on an algorithm for backpropagation in neural networks. My program calculates the partial derivative of each weight with respect to the loss function, and stores it in an array. The weights at each layer are stored in a single 2d numpy array, and so the partial derivatives are stored as an array of numpy arrays, where each numpy array has a different size depending on the number of neurons in each layer.
When I want to average the array of partial derivatives after a number of training data has been used, I want to add each array together and divide by the number of arrays. Currently, I just iterate through each array and add each element together, but is there a quicker way? I could use ndarray with dtype=object but apparently, this has been deprecated.
For example, if I have the arrays:
arr1 = [ndarray([[1,1],[1,1],[1,1]]), ndarray([[2,2],[2,2]])]
arr2 = [ndarray([[3,3],[3,3],[3,3]]), ndarray([[4,4],[4,4]])]
How can I add these together to get the array:
arr3 = [ndarray([[4,4],[4,4],[4,4]]), ndarray([[6,6],[6,6]])]
You don't need to add the numbers in the array element-wise, make use of numpy's parallel computations by using numpy.add
Here's some code to do just that:
import numpy as np
arr1 = np.asarray([[[1,1],[1,1],[1,1]], [[2,2],[2,2]]])
arr2 = np.asarray([[[3,3],[3,3],[3,3]], [[4,4],[6,6]]])
ans = []
for first, second in zip(arr1, arr2):
ans.append(np.add(first,second))
Outputs:
>>> [array([[4, 4], [4, 4], [4, 4]]), array([[6, 6], [8, 8]])]
P.S
Could use a one-liner list-comprehension as well
ans = [np.add(first, second) for first, second in zip(arr1, arr2)]
You can use zip/map/sum:
import numpy as np
arr1 = [np.array([[1,1],[1,1],[1,1]]), np.array([[2,2],[2,2]])]
arr2 = [np.array([[3,3],[3,3],[3,3]]), np.array([[4,4],[4,4]])]
arr3 = list(map(sum, zip(arr1, arr2)))
output:
>>> arr3
[array([[4, 4],
[4, 4],
[4, 4]]),
array([[6, 6],
[6, 6]])]
In NumPy, you can add two arrays element-wise by adding two NumPy arrays.
N.B: if your array shape varies then reshape the array and fill with 0.
arr1 = np.array([np.array([[1,1],[1,1],[1,1]]), np.array([[2,2],[2,2]])])
arr2 = np.array([np.array([[3,3],[3,3],[3,3]]), np.array([[4,4],[4,4]])])
arr3 = arr2 + arr1
You can use a list comprehension:
[x + y for x, y in zip(arr1, arr2)]
I would like to apply the same matrix (3x3) to a large list of points that are contained in a vector. The vector is of the form (40000 x 3). The below code does the job but it is too slow. Are there any numpy tricks I can use to eliminate the for loop and append function?
def apply_matrix_to_shape(Matrix,Points):
"""input a desired transformation and an array of points that are in
the format np.array([[x1,y1,z1],[x2,y2,z2],...,]]). will output
a new array of translated points with the same format"""
New_shape = np.array([])
M = Matrix
for p in Points:
New_shape = np.append(New_shape,[p[0]*M[0][0]+p[1]*M[0][1]+p[2]*M[0][2],
p[0]*M[1][0]+p[1]*M[1][1]+p[2]*M[1][2],
p[0]*M[2][0]+p[1]*M[2][1]+p[2]*M[2][2]])
Rows = int(len(New_shape) / 3)
return np.reshape(New_shape,(Rows,3))
You basically want the matrix multiplication of both arrays (not an element-wise one). You just need to tranpose so the shapes are aligned, and transpose back the result:
m.dot(p.T).T
Or equivalently:
(m#p.T).T
m = np.random.random((3,3))
p = np.random.random((15,3))
np.allclose((m#p.T).T, apply_matrix_to_shape(m, p))
# True
Indeed, I think what you want is one of the main reason why NumPy came to live. You can use the dot product function and the transpose function (simply .T or .transpose())
import numpy as np
points = np.array([[1, 2, 3],
[4, 5, 6]])
T_matrix = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
result = points.dot(T_matrix.T)
print(result)
>>> [[ 14 32 50]
[ 32 77 122]]
How do I use Numpy matrix operations to calculate over multiple vector samples at once?
Please see below the code I came up with, 'd' is the outcome I'm trying to get. But this is only one sample. How do I calculate the output without doing something like repeat the code for every sample OR looping through every sample?
a = np.array([[1, 2, 3]])
b = np.array([[1, 2, 3]])
c = np.array([[1, 2, 3]])
d = ((a.T * b).flatten() * c.T)
a1 = np.array([[2, 3, 4]])
b1 = np.array([[2, 3, 4]])
c1 = np.array([[2, 3, 4]])
d1 = ((a1.T * b1).flatten() * c1.T)
a2 = np.array([[3, 4, 5]])
b2 = np.array([[3, 4, 5]])
c2 = np.array([[3, 4, 5]])
d2 = ((a2.T * b2).flatten() * c2.T)
The way broadcasting works is to repeat your data along an axis of size one as many times as necessary to make your element-wise operation work. That is what is happening to axis 1 of a.T and axis 0 of b. Similar for the product of the result. My recommendation would be to concatenate all your inputs along another dimension, to allow broadcasting to happen along the existing two.
Before showing how to do that, let me just mention that you would be much better off using ravel instead of flatten in your example. flatten makes a copy of the data, while ravel only makes a view. Since a.T * b is a temporary matrix anyway, there is really no reason to make the copy.
The easiest way to combine some arrays along a new dimension is np.stack. I would recommend combining along the first dimension for a couple of reasons. It's the default for stack and your result can be indexed more easily: d[0] will be d, d[1] will be d1, etc. If you ever add matrix multiplication into your pipeline, np.dot will work out of the box since it operates on the last two dimensions.
a = np.stack((a0, a1, a2, ..., aN))
b = np.stack((b0, b1, b2, ..., bN))
c = np.stack((c0, c1, c2, ..., cN))
Now a, b and c are all 3D arrays the first dimension is the measurement index. The second and third correspond to the two dimensions of the original arrays.
With this structure, what you called transpose before is just swapping the last two dimensions (since one of them is 1), and raveling/flattening is just multiplying out the last two dimensions, e.g. with reshape:
d = (a.reshape(N, -1, 1) * b).reshape(N, 1, -1) * c.reshape(N, -1, 1)
If you set one of the dimensions to have size -1 in the reshape, it will absorb the remaining size. In this case, all your arrays have 3 elements, so the -1 will be equivalent to 3.
You have to be a little careful when you convert the ravel operation to 3D. In 2D, x.ravel() * c.T implicitly transforms x into a 1xN array before broadcasting. In 3D, x.reshape(3, -1) creates a 2D 3x27 array, which you multiply by c.reshape(3, -1, 1), which is 3x3x1. Broadcasting rules state that you are effectively multiplying a 1x3x27 array by a 3x3x1, but you really want to multiply a 3x1x27 array by the 3x3x1, so you need to specify all three axes for the 3D "ravel" explicitly.
Here is an IDEOne link with your sample data for you to play with: https://ideone.com/p8vTlx