Stack Numpy Arrays Without Extra Checks

I have two NumPy arrays, labels and more_labels. In one case both arrays are 1D, with shapes (m,) and (n,); in another case both are 2D, with shapes (m,k) and (n,k). I would like to combine them so that the resulting array has shape (m+n,) in the 1D case or (m+n,k) in the 2D case.
Currently I have to handle the two cases separately, like this:
if labels.ndim > 1:
    combined = numpy.vstack((labels, more_labels))
else:
    combined = numpy.hstack((labels, more_labels))
Is there a Numpy method to handle both cases together?

You need np.concatenate() to join your arrays along a given axis. In this case, since you want to join them along the first axis, you can just use the default axis argument, which is 0.
numpy.concatenate((a1, a2, ...), axis=0)
Join a sequence of arrays along an existing axis.
Here is an example:
In [18]: a = np.array([1,2, 3])
In [19]: b = np.array([0,0, 3])
In [20]: np.hstack((a, b))
Out[20]: array([1, 2, 3, 0, 0, 3])
In [21]: np.concatenate((a, b))
Out[21]: array([1, 2, 3, 0, 0, 3])
In [22]: a = np.array([[1],[2], [3]])
In [23]: b = np.array([[0],[0], [3]])
In [24]: np.vstack((a, b))
Out[24]:
array([[1],
       [2],
       [3],
       [0],
       [0],
       [3]])
In [25]: np.concatenate((a, b))
Out[25]:
array([[1],
       [2],
       [3],
       [0],
       [0],
       [3]])
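As a minimal sketch of how this removes the ndim check from the question, the snippet below wraps np.concatenate in a small helper. The helper name combine and the sample arrays are made up for illustration.

```python
import numpy as np

def combine(labels, more_labels):
    """Join two arrays along axis 0, whether they are 1D or 2D."""
    return np.concatenate((labels, more_labels), axis=0)

# 1D case: (3,) + (2,) -> (5,)
one_d = combine(np.array([1, 2, 3]), np.array([4, 5]))

# 2D case: (3, 2) + (1, 2) -> (4, 2)
two_d = combine(np.zeros((3, 2)), np.ones((1, 2)))

print(one_d.shape)  # (5,)
print(two_d.shape)  # (4, 2)
```

Both inputs must have the same number of dimensions, and in the 2D case the same k, otherwise np.concatenate raises a ValueError.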

Related

Multi-dimensional array notation in Python

I have two arrays A and i with dimensions (1, 3, 3) and (1, 3, 2) respectively. I want to define a new array I that picks elements of A using the index pairs in i. The current and desired outputs are shown below.
import numpy as np
i=np.array([[[0,0],[1,2],[2,2]]])
A = np.array([[[1,2,3],[4,5,6],[7,8,9]]], dtype=float)
I=A[0,i]
print([I])
The current output is
[array([[[[1.000000000, 2.000000000, 3.000000000],
          [1.000000000, 2.000000000, 3.000000000]],
         [[4.000000000, 5.000000000, 6.000000000],
          [7.000000000, 8.000000000, 9.000000000]],
         [[7.000000000, 8.000000000, 9.000000000],
          [7.000000000, 8.000000000, 9.000000000]]]])]
The desired output is
[array([[[1],[6],[9]]])]

In [131]: A.shape, i.shape
Out[131]: ((1, 3, 3), (1, 3, 2))
That leading size 1 dimension just adds a [] layer, and complicates indexing (a bit):
In [132]: A[0]
Out[132]:
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
This is the indexing that I think you want:
In [133]: A[0,i[0,:,0],i[0,:,1]]
Out[133]: array([1, 6, 9])
If you really need a trailing size 1 dimension, add it after:
In [134]: A[0,i[0,:,0],i[0,:,1]][:,None]
Out[134]:
array([[1],
       [6],
       [9]])
From the desired numbers, I deduced that you wanted to use the 2 columns of i as indices to two different dimensions of A:
In [135]: i[0]
Out[135]:
array([[0, 0],
       [1, 2],
       [2, 2]])
Another way to do the same thing:
In [139]: tuple(i.T)
Out[139]:
(array([[0],
        [1],
        [2]]),
 array([[0],
        [2],
        [2]]))
In [140]: A[0][tuple(i.T)]
Out[140]:
array([[1],
       [6],
       [9]])
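The indexing in this answer can be put together as a small standalone script. This is only a sketch using the arrays from the question; the names pairs, rows, cols and picked are introduced here for readability.

```python
import numpy as np

A = np.array([[[1, 2, 3], [4, 5, 6], [7, 8, 9]]], dtype=float)  # shape (1, 3, 3)
i = np.array([[[0, 0], [1, 2], [2, 2]]])                        # shape (1, 3, 2)

pairs = i[0]        # shape (3, 2): one (row, col) pair per line
rows = pairs[:, 0]  # array([0, 1, 2])
cols = pairs[:, 1]  # array([0, 2, 2])

# Two index arrays of the same shape are paired element by element,
# so this picks A[0, 0, 0], A[0, 1, 2] and A[0, 2, 2].
picked = A[0, rows, cols]
print(picked)  # [1. 6. 9.]

# Restore the leading and trailing size-1 axes if they are needed.
print(picked[None, :, None].shape)  # (1, 3, 1)
```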

You must enter
I=A[0,:1,i[:,1]]

You can use numpy's take for that.
However, take works with a flat index, so you will need to use [0, 5, 8] for your indexes instead.
Here is an example:
>>> I = [A.shape[2] * x + y for x,y in i[0]] # Convert to flat indexes
>>> I = np.expand_dims(I, axis=(1,2))
>>> A.take(I)
array([[[1.]],
       [[6.]],
       [[9.]]])

Selecting whole subarrays given a multidimensional index [duplicate]

For example, I have two numpy arrays,
A = np.array(
    [[0,1],
     [2,3],
     [4,5]])
B = np.array(
    [[1],
     [0],
     [1]], dtype='int')
and I want to extract one element from each row of A, and that element is indexed by B, so I want the following results:
C = np.array(
    [[1],
     [2],
     [5]])
I tried A[:, B.ravel()], but that broadcasts B, which is not what I want. I also looked into np.take, but it does not seem to be the right solution to my problem.
However, I could use np.choose by transposing A,
np.choose(B.ravel(), A.T)
but is there a better solution?

You can use NumPy's purely integer array indexing -
A[np.arange(A.shape[0]),B.ravel()]
Sample run -
In [57]: A
Out[57]:
array([[0, 1],
       [2, 3],
       [4, 5]])
In [58]: B
Out[58]:
array([[1],
       [0],
       [1]])
In [59]: A[np.arange(A.shape[0]),B.ravel()]
Out[59]: array([1, 2, 5])
Please note that if B is a 1D array or a list of such column indices, you could simply skip the flattening operation with .ravel().
Sample run -
In [186]: A
Out[186]:
array([[0, 1],
       [2, 3],
       [4, 5]])
In [187]: B
Out[187]: [1, 0, 1]
In [188]: A[np.arange(A.shape[0]),B]
Out[188]: array([1, 2, 5])
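Another option, not mentioned in the answers here, is np.take_along_axis (available since NumPy 1.15). The sketch below uses the arrays from the question and keeps the (3, 1) shape of the desired C without any ravel or reshape.

```python
import numpy as np

A = np.array([[0, 1],
              [2, 3],
              [4, 5]])
B = np.array([[1],
              [0],
              [1]])

# For each row r, pick A[r, B[r, 0]]; the result has the same shape as B.
C = np.take_along_axis(A, B, axis=1)
print(C.tolist())  # [[1], [2], [5]]
```

Here B must have the same number of dimensions as A, which is already the case for the column-shaped B in the question.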

C = np.array([A[i][j] for i,j in enumerate(B)])

How does numpy three dimensional slicing and indexing and ellipsis work?

I'm having a hard time understanding how some of numpy's slicing and indexing works.
The first one is the following:
>>> x = np.array([[[1],[2],[3]], [[4],[5],[6]]])
>>> x.shape
(2, 3, 1)
>>> x[1:2]
array([[[4],
        [5],
        [6]]])
According to the documentation,
If the number of objects in the selection tuple is less than N , then
: is assumed for any subsequent dimensions.
So does that mean [[1], [2], [3]], [[4], [5], [6]] is a 2x3 array itself?
And how does
x[1:2]
return
array([[[4],
        [5],
        [6]]])
?
The second is ellipsis,
>>> x[...,0]
array([[1, 2, 3],
       [4, 5, 6]])
Ellipsis expand to the number of : objects needed to make a selection
tuple of the same length as x.ndim. There may only be a single
ellipsis present.
What does [...,0] mean?

For your first question, it means that x of shape (2, 3, 1) has 2 slices of 3x1 arrays.
In [40]: x
Out[40]:
array([[[1],
        [2],    # <= slice 1 of shape 3x1
        [3]],
       [[4],
        [5],    # <= slice 2 of shape 3x1
        [6]]])
Now, when you execute x[1:2], you get only the second slice (index 1). Slicing in Python and NumPy is left-inclusive and right-exclusive (a half-open interval, i.e. [1,2)), so index 1 is included and index 2 is not.
In [42]: x[1:2]
Out[42]:
array([[[4],
        [5],
        [6]]])
Because a slice (rather than an integer index) was used, the first axis is kept with length 1, which is why the result has shape (1, 3, 1).
For your second question,
In [45]: x.ndim
Out[45]: 3
So, when you use an ellipsis, it expands to as many : as are needed to make the index tuple have length 3.
In [47]: x[...,0]
Out[47]:
array([[1, 2, 3],
       [4, 5, 6]])
The above is equivalent to x[:, :, 0]: you take both slices and all of their rows, and pick element 0 along the last axis, so that axis disappears.
But instead, if you do
In [49]: x[0, ..., 0]
Out[49]: array([1, 2, 3])
This is equivalent to x[0, :, 0]: you take only the first slice, all of its rows, and element 0 along the last axis.
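A short script can confirm how the ellipsis and slices behave on the array from the question. This is an illustrative check, not part of the original answer.

```python
import numpy as np

x = np.array([[[1], [2], [3]], [[4], [5], [6]]])  # shape (2, 3, 1)

# "..." stands for as many ":" as are needed to reach x.ndim indices.
same_full = np.array_equal(x[..., 0], x[:, :, 0])
same_first = np.array_equal(x[0, ..., 0], x[0, :, 0])
print(same_full, same_first)  # True True

# A slice keeps its axis (with length 1 here); an integer index drops it.
print(x[1:2].shape)     # (1, 3, 1)
print(x[1].shape)       # (3, 1)
print(x[..., 0].shape)  # (2, 3)
```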

To be precise about x[1:2]: the slice starts at index 1, so it returns the second slice, not the first, as the output shows:
In [42]: x[1:2]
Out[42]:
array([[[4],
        [5],
        [6]]])

Transforming a row vector into a column vector in Numpy

Let's say I have a row vector of the shape (1, 256). I want to transform it into a column vector of the shape (256, 1) instead. How would you do it in Numpy?

You can use the transpose operation to do this:
Example:
In [2]: a = np.array([[1,2], [3,4], [5,6]])
In [5]: a.shape
Out[5]: (3, 2)
In [6]: a_trans = a.T #or: np.transpose(a), a.transpose()
In [8]: a_trans.shape
Out[8]: (2, 3)
In [7]: a_trans
Out[7]:
array([[1, 3, 5],
       [2, 4, 6]])
Note that the original array a remains unmodified. The transpose operation does not copy the data; it returns a view of the same memory with the axes swapped.
If your input array is 1D instead, you can promote it to a column vector by introducing a new (singleton) axis as the second dimension. Below is an example:
# 1D array
In [13]: arr = np.arange(6)
# promotion to a column vector (i.e., a 2D array)
In [14]: arr = arr[..., None] #or: arr = arr[:, np.newaxis]
In [15]: arr
Out[15]:
array([[0],
       [1],
       [2],
       [3],
       [4],
       [5]])
In [12]: arr.shape
Out[12]: (6, 1)
For the 1D case, yet another option would be to use numpy.atleast_2d() followed by a transpose operation, as suggested by ankostis in the comments.
In [9]: np.atleast_2d(arr).T
Out[9]:
array([[0],
       [1],
       [2],
       [3],
       [4],
       [5]])
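Applied to the (1, 256) row vector from the question, the transpose looks like the sketch below. The contents of the array are made up, and np.shares_memory is used only to show that the result is a view of the original data.

```python
import numpy as np

row = np.arange(256).reshape(1, 256)  # row vector, shape (1, 256)

col = row.T                           # column vector, shape (256, 1)
print(col.shape)                      # (256, 1)

# The transpose is a view: it shares memory with the original array.
print(np.shares_memory(row, col))     # True

# Writing through the view is therefore visible in the original.
col[0, 0] = -1
print(row[0, 0])                      # -1
```

If an independent column vector is needed, calling .copy() on the result is one way to get it.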

We can simply use the reshape functionality of numpy:
a=np.array([[1,2,3,4]])
a:
array([[1, 2, 3, 4]])
a.shape
(1,4)
b=a.reshape(-1,1)
b:
array([[1],
       [2],
       [3],
       [4]])
b.shape
(4,1)

Some of the ways I have compiled to do this are:
>>> import numpy as np
>>> a = np.array([[1, 2, 3], [2, 4, 5]])
>>> a
array([[1, 2, 3],
       [2, 4, 5]])
One way is to transpose:
>>> a.T
array([[1, 2],
       [2, 4],
       [3, 5]])
Another way is reshape, but note that it only changes the shape and keeps the elements in their original order, so the result differs from a.T:
>>> a.reshape(a.shape[1], a.shape[0])
array([[1, 2],
       [3, 2],
       [4, 5]])
I have used a 2-dimensional array in all of these examples; the real problem arises when you have a 1-dimensional row vector that you want to turn into a column elegantly.
reshape lets you specify one of the dimensions (the number of rows or of columns) and pass -1 for the other; NumPy then works out the missing dimension by itself.
>>> a.reshape(-1, 1)
array([[1],
       [2],
       [3],
       [2],
       [4],
       [5]])
>>> a = np.array([1, 2, 3])
>>> a.reshape(-1, 1)
array([[1],
       [2],
       [3]])
>>> a.reshape(2, -1)
...
ValueError: cannot reshape array of size 3 into shape (2,newaxis)
So you can fix one dimension and leave the other to NumPy, as long as (m * n) / your_choice is an integer.
If you want to know more about this -1, head over to:
What does -1 mean in numpy reshape?
Note: None of these operations changes the shape of the original array; each returns a new array object (often a view on the same data).

You can use the reshape() method of a NumPy array.
To transform any row vector to a column vector, use
array.reshape(-1, 1)
To convert any column vector to a row vector, use
array.reshape(1, -1)
reshape() is used to change the shape of the array.
So if you want to create a 2x2 matrix, you can call the method like a.reshape(2, 2).
So why the -1 in the answer?
If you don't want to specify one dimension explicitly (or it is unknown) and want NumPy to find the value for you, you can pass -1 for that dimension. NumPy will then calculate the value from the remaining dimensions. Keep in mind that you cannot pass -1 for more than one dimension.
Thus in the first case (array.reshape(-1, 1)) the second dimension (columns) is 1 and the first (rows) is unknown (-1). So NumPy will figure out how to represent a 1-by-4 array as x-by-1 and find x for you.
An alternative solution with the reshape method is a.reshape(a.shape[1], a.shape[0]). Here you are specifying the dimensions explicitly.
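The rules for -1 described above can be tried out directly. The following is a small sketch with a made-up 12-element array; the exact wording of the error messages may differ between NumPy versions.

```python
import numpy as np

a = np.arange(12)

# -1 asks NumPy to infer that dimension from the total size.
print(a.reshape(-1, 1).shape)  # (12, 1)
print(a.reshape(1, -1).shape)  # (1, 12)
print(a.reshape(3, -1).shape)  # (3, 4)

# The known dimensions must divide the size evenly...
try:
    a.reshape(5, -1)
    uneven_failed = False
except ValueError as exc:
    uneven_failed = True
    print("cannot infer:", exc)

# ...and only one dimension may be -1.
try:
    a.reshape(-1, -1)
    double_failed = False
except ValueError as exc:
    double_failed = True
    print("ambiguous:", exc)
```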

Using np.newaxis can be a bit counterintuitive. But it is possible.
>>> a = np.array([1,2,3])
>>> a.shape
(3,)
>>> a[:,np.newaxis].shape
(3, 1)
>>> a[:,None]
array([[1],
       [2],
       [3]])
np.newaxis is an alias for None, so you can use None instead.
But this is not recommended because it impairs readability.

Converting a row vector into a column vector in Python can be important, e.g. to use broadcasting:
import numpy as np
def colvec(rowvec):
    v = np.asarray(rowvec)
    return v.reshape(v.size, 1)
colvec([1,2,3]) * [[1,2,3], [4,5,6], [7,8,9]]
This multiplies the first row by 1, the second row by 2 and the third row by 3:
array([[ 1,  2,  3],
       [ 8, 10, 12],
       [21, 24, 27]])
In contrast, trying to use a column vector typed as matrix:
np.asmatrix([1, 2, 3]).transpose() * [[1,2,3], [4,5,6], [7,8,9]]
fails, because * on a matrix means matrix multiplication rather than element-wise multiplication, with the error ValueError: shapes (3,1) and (3,3) not aligned: 1 (dim 1) != 3 (dim 0).

Python: Differentiating between row and column vectors

Is there a good way of differentiating between row and column vectors in numpy? If I were to give someone a vector, say:
from numpy import *
v = array([1,2,3])
they wouldn't be able to say whether I mean a row or a column vector. Moreover:
>>> array([1,2,3]) == array([1,2,3]).transpose()
array([ True, True, True])
which compares the vectors element-wise.
I realize that most of the functions on vectors from the mentioned modules don't need the differentiation, for example outer(a,b) or a.dot(b), but I'd like to differentiate for my own convenience.

You can make the distinction explicit by adding another dimension to the array.
>>> a = np.array([1, 2, 3])
>>> a
array([1, 2, 3])
>>> a.transpose()
array([1, 2, 3])
>>> a.dot(a.transpose())
14
Now force it to be a column vector:
>>> a.shape = (3,1)
>>> a
array([[1],
       [2],
       [3]])
>>> a.transpose()
array([[1, 2, 3]])
>>> a.dot(a.transpose())
array([[1, 2, 3],
       [2, 4, 6],
       [3, 6, 9]])
Another option is to use np.newaxis when you want to make the distinction:
>>> a = np.array([1, 2, 3])
>>> a
array([1, 2, 3])
>>> a[:, np.newaxis]
array([[1],
       [2],
       [3]])
>>> a[np.newaxis, :]
array([[1, 2, 3]])
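Once the extra dimension is explicit, the shape alone tells the three cases apart. The helper below is a hypothetical illustration (the name describe is not a NumPy function) of how such a check could look.

```python
import numpy as np

def describe(v):
    """Classify an array as 1D, row vector, column vector or other by its shape."""
    v = np.asarray(v)
    if v.ndim == 1:
        return "1D"
    if v.ndim == 2 and v.shape[0] == 1:
        return "row"
    if v.ndim == 2 and v.shape[1] == 1:
        return "column"
    return "other"

a = np.array([1, 2, 3])
print(describe(a))                 # 1D
print(describe(a[np.newaxis, :]))  # row
print(describe(a[:, np.newaxis]))  # column
```

A (1, 1) array satisfies both 2D conditions; this sketch reports it as a row simply because that check comes first.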

Use double [] when writing your vectors.
Then, if you want a row vector:
row_vector = array([[1, 2, 3]]) # shape (1, 3)
Or if you want a column vector:
col_vector = array([[1, 2, 3]]).T # shape (3, 1)

The vector you are creating is neither a row nor a column. It actually has only 1 dimension. You can verify that by
checking the number of dimensions, my_vector.ndim, which is 1
checking my_vector.shape, which is (3,) (a tuple with only one element). For a row vector it should be (1, 3), and for a column (3, 1)
There are two ways to handle this:
create an actual row or column vector
reshape your current one
You can explicitly create a row or column
row = np.array([  # one row with 3 elements
    [1, 2, 3]
])
column = np.array([  # 3 rows, with 1 element each
    [1],
    [2],
    [3]
])
or, with a shortcut
row = np.r_['r', [1,2,3]] # shape: (1, 3)
column = np.r_['c', [1,2,3]] # shape: (3,1)
Alternatively, you can reshape it to (1, n) for row, or (n, 1) for column
row = my_vector.reshape(1, -1)
column = my_vector.reshape(-1, 1)
where the -1 automatically finds the value of n.

I think you can use the ndmin option of numpy.array. Setting it to 2 gives an array of shape (1, 4), and its transpose has shape (4, 1).
>>> a = np.array([12, 3, 4, 5], ndmin=2)
>>> print(a.shape)
(1, 4)
>>> print(a.T.shape)
(4, 1)

If you want a distinction for this case, I would recommend using a matrix instead, where:
matrix([1,2,3]) == matrix([1,2,3]).transpose()
gives:
matrix([[ True, False, False],
        [False,  True, False],
        [False, False,  True]], dtype=bool)
You can also use a ndarray explicitly adding a second dimension:
array([1,2,3])[None,:]
#array([[1, 2, 3]])
and:
array([1,2,3])[:,None]
#array([[1],
#       [2],
#       [3]])

You can store the array's elements in a column or a row as follows:
>>> a = np.array([1, 2, 3])[:, None] # column vector: one element per row
>>> a
array([[1],
       [2],
       [3]])
>>> b = np.array([1, 2, 3])[None, :] # row vector: all elements in one row
>>> b
array([[1, 2, 3]])

If I want a 1x3 array, or 3x1 array:
import numpy as np
row_arr = np.array([1,2,3]).reshape((1,3))
col_arr = np.array([1,2,3]).reshape((3,1))
Check your work:
row_arr.shape #returns (1,3)
col_arr.shape #returns (3,1)
I found a lot of answers here are helpful, but much too complicated for me. In practice I come back to shape and reshape and the code is readable: very simple and explicit.

When I tried to compute w^T * x using numpy, it was super confusing for me as well. In fact, I couldn't implement it myself. So, this is one of the few gotchas in NumPy that we need to acquaint ourselves with.
As far as 1D arrays are concerned, there is no distinction between a row vector and a column vector. They are exactly the same.
Look at the following examples, where we get the same result in all cases, which is not true in (the theoretical sense of) linear algebra:
In [37]: w
Out[37]: array([0, 1, 2, 3, 4])
In [38]: x
Out[38]: array([1, 2, 3, 4, 5])
In [39]: np.dot(w, x)
Out[39]: 40
In [40]: np.dot(w.transpose(), x)
Out[40]: 40
In [41]: np.dot(w.transpose(), x.transpose())
Out[41]: 40
In [42]: np.dot(w, x.transpose())
Out[42]: 40
With that information, now let's try to compute the squared length of the vector |w|^2.
For this, we need to transform w to 2D array.
In [51]: wt = w[:, np.newaxis]
In [52]: wt
Out[52]:
array([[0],
       [1],
       [2],
       [3],
       [4]])
Now, let's compute the squared length (or squared magnitude) of the vector w :
In [53]: np.dot(w, wt)
Out[53]: array([30])
Note that we used w, wt instead of wt, w (like in theoretical linear algebra) because of shape mismatch with the use of np.dot(wt, w). So, we have the squared length of the vector as [30]. Maybe this is one of the ways to distinguish (numpy's interpretation of) row and column vector?
And finally, did I mention that I figured out the way to implement w^T * x ? Yes, I did :
In [58]: wt
Out[58]:
array([[0],
       [1],
       [2],
       [3],
       [4]])
In [59]: x
Out[59]: array([1, 2, 3, 4, 5])
In [60]: np.dot(x, wt)
Out[60]: array([40])
So, in NumPy, the order of the operands is reversed, as evidenced above, contrary to what we studied in theoretical linear algebra.
P.S. : potential gotchas in numpy
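If the textbook operand order matters, one possible approach (a sketch, not what the answer above does) is to make both vectors explicit 2D columns, so that w^T x can be written literally. The names w_col and x_col are introduced here for illustration.

```python
import numpy as np

w = np.array([0, 1, 2, 3, 4])
x = np.array([1, 2, 3, 4, 5])

# Promote both to explicit column vectors of shape (5, 1).
w_col = w[:, np.newaxis]
x_col = x[:, np.newaxis]

inner = w_col.T @ x_col  # (1, 5) @ (5, 1) -> shape (1, 1)
outer = w_col @ x_col.T  # (5, 1) @ (1, 5) -> shape (5, 5)

print(inner)        # [[40]]
print(outer.shape)  # (5, 5)
print(w @ x)        # 40
```

With 1D inputs, w @ x already gives the scalar inner product; the 2D form is mainly useful when the row/column distinction has to be kept visible in the shapes.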

It looks like Python's Numpy doesn't distinguish it unless you use it in context:
"You can have standard vectors or row/column vectors if you like. "
" :) You can treat rank-1 arrays as either row or column vectors. dot(A,v) treats v as a column vector, while dot(v,A) treats v as a row vector. This can save you having to type a lot of transposes. "
Also, specific to your code: "Transpose on a rank-1 array does nothing. "
Source:
Link

Here's another intuitive way. Suppose we have:
>>> a = np.array([1, 3, 4])
>>> a
array([1, 3, 4])
First we make a 2D array with that as the only row:
>>> a = np.array([a])
>>> a
array([[1, 3, 4]])
Then we can transpose it:
>>> a.T
array([[1],
       [3],
       [4]])

Row vectors are (1,0) tensors, vectors are (0,1) tensors. If using v = np.array([[1,2,3]]), v becomes a (0,2) tensor. Sorry, I am confused.

The excellent Pandas library adds features to numpy that make these kinds of operations more intuitive IMO. For example:
import numpy as np
import pandas as pd
# column
df = pd.DataFrame([1,2,3])
# row
df2 = pd.DataFrame([[1,2,3]])
You can even define a DataFrame and make a spreadsheet-like pivot table.
