Partial dimensions in python - python

When I declare a multidimensional array in Python and print its shape using NumPy:
B=[[2,3,4]]
print(np.shape(B))
it gives the following output:
(1,3)
This is understandable as the inner bracket would represent the second dimension which has 3 components.
But when I run the following code:
B=[2,3,4]
print(np.shape(B))
It prints:
(3,)
How do I explain these partial dimensions to myself?
It means the second dimension exists but the number of elements in it is unknown. How does one infer from the array [2,3,4] that a second dimension exists? Shouldn't the shape just be (3)?

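It's a matter of syntax. (3,) is how Python writes the one-element tuple containing 3. The trailing comma is required because (3) without it is interpreted as the integer 3 in parentheses. So the shape (3,) describes a single dimension of size 3; there is no second dimension.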
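To make the distinction concrete, here is a minimal sketch. The variable names are illustrative only and do not come from the question.

```python
import numpy as np

# Parentheses alone only group an expression; the comma is what makes a tuple.
just_an_int = (3)      # same as writing 3
one_tuple = (3,)       # a tuple holding one element

print(type(just_an_int).__name__)  # int
print(type(one_tuple).__name__)    # tuple

# The length of the shape tuple is the number of dimensions.
shape_1d = np.shape([2, 3, 4])     # one dimension of size 3
shape_2d = np.shape([[2, 3, 4]])   # two dimensions, sizes 1 and 3
print(shape_1d, len(shape_1d))     # (3,) 1
print(shape_2d, len(shape_2d))     # (1, 3) 2
```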

Related

Why can these arrays not be subtracted from each other? [duplicate]

I'm having some trouble understanding the rules for array broadcasting in Numpy.
Obviously, if you perform element-wise multiplication on two arrays of the same dimensions and shape, everything is fine. Also, if you multiply a multi-dimensional array by a scalar it works. This I understand.
But if you have two N-dimensional arrays of different shapes, it's unclear to me exactly what the broadcasting rules are. This documentation/tutorial explains that: In order to broadcast, the size of the trailing axes for both arrays in an operation must either be the same size or one of them must be one.
Okay, so I assume by trailing axis they are referring to the N in an M x N array. So, that means if I attempt to multiply two 2D arrays (matrices) with equal number of columns, it should work? Except it doesn't...
>>> from numpy import *
>>> A = array([[1,2],[3,4]])
>>> B = array([[2,3],[4,6],[6,9],[8,12]])
>>> print(A)
[[1 2]
 [3 4]]
>>> print(B)
[[ 2  3]
 [ 4  6]
 [ 6  9]
 [ 8 12]]
>>>
>>> A * B
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: shape mismatch: objects cannot be broadcast to a single shape
Since both A and B have two columns, I would have thought this would work. So, I'm probably misunderstanding something here about the term "trailing axis", and how it applies to N-dimensional arrays.
Can someone explain why my example doesn't work, and what is meant by "trailing axis"?
Well, the meaning of trailing axes is explained on the linked documentation page.
If you have two arrays with a different number of dimensions, say one 1x2x3 and the other 2x3, then you compare only the trailing common dimensions, in this case 2x3. But if both your arrays are two-dimensional, then their corresponding sizes have to be either equal or one of them has to be 1. Dimensions along which the array has size 1 are called singular, and the array can be broadcast along them.
In your case you have a 2x2 and a 4x2 array. Along the first axis 4 != 2, and neither 4 nor 2 equals 1, so this doesn't work.
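As a rough illustration of the point above, the sketch below reuses A and B from the question, confirms that the product fails, and then shows two shapes that are compatible with B. The names failed, row, scaled and scaled_2d are made up for this example.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])                      # shape (2, 2)
B = np.array([[2, 3], [4, 6], [6, 9], [8, 12]])     # shape (4, 2)

# Axis 0 has sizes 2 and 4: not equal, and neither is 1, so this fails.
try:
    A * B
    failed = False
except ValueError:
    failed = True
print(failed)  # True

# A single row of shape (2,) or (1, 2) is compatible with (4, 2),
# because the missing or size-1 axis can be stretched to 4.
row = A[0]                 # shape (2,)
scaled = row * B           # shape (4, 2)
scaled_2d = A[0:1] * B     # A[0:1] has shape (1, 2)
print(scaled.shape, scaled_2d.shape)
```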
From http://cs231n.github.io/python-numpy-tutorial/#numpy-broadcasting:
Broadcasting two arrays together follows these rules:
If the arrays do not have the same rank, prepend the shape of the lower rank array with 1s until both shapes have the same length.
The two arrays are said to be compatible in a dimension if they have the same size in the dimension, or if one of the arrays has size 1 in that dimension.
The arrays can be broadcast together if they are compatible in all dimensions.
After broadcasting, each array behaves as if it had shape equal to the elementwise maximum of shapes of the two input arrays.
In any dimension where one array had size 1 and the other array had size greater than 1, the first array behaves as if it were copied along that dimension.
If this explanation does not make sense, try reading the explanation from the documentation or this explanation.
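The quoted rules can also be written out as a few lines of plain Python. This is only a sketch: broadcast_shape is a hypothetical helper, not a NumPy function, and the result is compared against np.broadcast to check that the two agree.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: apply the quoted rules to two shape tuples."""
    ndim = max(len(shape_a), len(shape_b))
    # Prepend 1s to the shorter shape until both have the same length.
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    result = []
    for size_a, size_b in zip(a, b):
        # Sizes are compatible if they match or if one of them is 1.
        if size_a != size_b and size_a != 1 and size_b != 1:
            raise ValueError(f"incompatible sizes {size_a} and {size_b}")
        # The size that is not 1 determines the output size.
        result.append(size_b if size_a == 1 else size_a)
    return tuple(result)

print(broadcast_shape((1, 2, 3), (2, 3)))   # (1, 2, 3)
print(broadcast_shape((4, 1), (3,)))        # (4, 3)

# Cross-check against NumPy's own broadcasting.
numpy_shape = np.broadcast(np.empty((4, 1)), np.empty((3,))).shape
print(numpy_shape)                          # (4, 3)
```

Calling broadcast_shape((2, 2), (4, 2)) raises ValueError, which mirrors the failure in the question.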
We should consider two points about broadcasting. First: what is possible. Second: how much of what is possible NumPy actually does.
I know it might look a bit confusing, but I will make it clear with an example.
Let's start from the beginning.
Suppose we have two arrays: the first (named A) has three dimensions and the second (named B) has five. NumPy tries to match the last (trailing) dimensions, so it does not care about the first two dimensions of B. It then compares the trailing dimensions pairwise, and if and only if they are equal or one of them is 1, NumPy says "O.K., you two match". If these conditions are not satisfied, NumPy says "sorry... it's not my job!".
You might say that the comparison should also succeed when one size is divisible by the other (4 and 2, or 9 and 3), since the smaller array could be replicated a whole number of times (2 or 3 times in our example). I agree with you, and this is why I started with a distinction between what is possible and what NumPy is capable of.
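If whole-number replication is what you actually want, it has to be requested explicitly. One way to do that, shown here as a sketch with the arrays from the question, is np.tile; the names A_tiled and product are illustrative.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])                      # shape (2, 2)
B = np.array([[2, 3], [4, 6], [6, 9], [8, 12]])     # shape (4, 2)

# Repeat A twice along axis 0 and once along axis 1: (2, 2) -> (4, 2).
A_tiled = np.tile(A, (2, 1))

# The shapes now match exactly, so element-wise multiplication works.
product = A_tiled * B
print(product)
```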

Difference in numpy np.zeros((1,2)) vs np.zeros((2,))

I am stuck on this question. Can anybody explain to me the difference between these two?
np.zeros((1,2))
which yields
[[0. 0.]]
and
np.zeros((2,))
which yields
[0. 0.]
For each element in the shape tuple passed to np.zeros, the function adds one dimension to the output array.
Your first code np.zeros((1,2)) yields an array with two dimensions, one element in the first dimension and two elements in the second dimension, thus
[[0. 0.]]
The second piece of code has only one element in the main argument, which is translated to "one single dimension, two elements in that dimension". Thus, the output to your np.zeros((2,)) will be the same as the one for np.zeros(2):
array([0., 0.])
You could try with a third dimension to see it further:
np.zeros((1,2,1))
array([[[0.],
        [0.]]])
In short, each pair of square brackets in the output corresponds to one dimension, i.e. one element of the shape tuple passed as the first argument to np.zeros.
Here's how I think about it.
This answer helpfully points out that "rows" and "columns" aren't exact parallels for NumPy arrays, which can have n dimensions. Rather, each dimension, or axis, is represented by a number (the size, how many members it has) and in notation by an additional pair of square brackets.
So a 1-dimensional array of size 5 isn't a row or a column, just a 1-dimensional array. When you initialise np.zeros ((1,2)) your first dimension has size 1, and your second size 2, so you get a 1 x 2 matrix with two pairs of brackets. When you call np.zeros((2,)) it's just one dimension of size two, so you get array([0., 0.]). I also find this confusing - hope it makes sense!
In the first, the items would be indexed as [0][0] and [0][1], and in the second, the items would be indexed by [0] and [1].
A shape of (1,2) means two dimensions, where the first dimension happens to have only one index, i.e. it's a matrix with one row.
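The difference in dimensions and indexing can be checked directly. This is a small sketch; two_d, one_d and first_row are names chosen for the example.

```python
import numpy as np

two_d = np.zeros((1, 2))   # one row, two columns
one_d = np.zeros((2,))     # same as np.zeros(2)

print(two_d.ndim, two_d.shape)   # 2 (1, 2)
print(one_d.ndim, one_d.shape)   # 1 (2,)

# Two indices are needed for the 2-D array, one for the 1-D array.
two_d[0, 1] = 5.0
one_d[1] = 5.0

# Indexing the first axis of the 2-D array gives a 1-D array of shape (2,).
first_row = two_d[0]
print(first_row.shape)           # (2,)
```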

operating on two numpy arrays of different shapes

suppose i have 2 numpy arrays as follows:
init = 100
a = np.append(init, np.zeros(5))
b = np.random.randn(5)
so a is of shape (6,) and b is of shape (5,). i would like to add these together (or perform some other operation, e.g. exponentiation) to obtain a new numpy array of shape (6,) whose first value is the same as the first value of a (100) and whose remaining values are combined element-wise. in this case the result will just look like 100 prepended to b, but that is because it is a toy example initialized with zeroes. attempting to add them as is will produce:
a+b
ValueError: operands could not be broadcast together with shapes (6,) (5,)
is there a one-liner way to use broadcasting, or newaxis here to trick numpy into treating them as compatible shapes?
the desired output:
array([ 100. , 1.93947328, 0.12075821, 1.65319123,
-0.29222052, -1.04465838])
You mean you want to do something like this?
np.append(a[0:1], a[1:] + b)
What do you want your desired output to be? The answer I've provided performs the add on everything except the first element of a.
Not a one-liner but two short lines:
c = a.copy()
c[1:] += b
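The same slicing idea extends to other operations such as exponentiation. The sketch below wraps it in a hypothetical helper, combine_tail, and uses fixed values for b instead of np.random.randn so the result is reproducible.

```python
import numpy as np

def combine_tail(a, b, op=np.add):
    """Hypothetical helper: apply op to a[1:] and b, keeping a[0] as is."""
    out = np.array(a, dtype=float)   # copy, so the input is not modified
    out[1:] = op(out[1:], b)
    return out

a = np.append(100, np.zeros(5))             # shape (6,)
b = np.array([1.0, 2.0, 3.0, 4.0, 5.0])     # shape (5,), fixed for the example

summed = combine_tail(a, b)
powered = combine_tail(np.full(6, 2.0), b, op=np.power)
print(summed)
print(powered)
```

Strictly speaking no broadcasting is involved here: a[1:] and b already have the same shape (5,), so the operation is a plain element-wise one on a slice.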

About Numpy shape

I'm new to numpy & have a question about it :
according to docs.scipy.org, the "shape" attribute is "the dimensions of the array. For a matrix with n rows and m columns, shape will be (n,m)"
Suppose I am to create a simple array as below:
np.array([[0,2,4],[1,3,5]])
Using the "shape" attribute, I get (2,3) (i.e. the array has 2 rows & 3 columns)
However, for an array ([0,2,4]), shape returns (3,) (which means it has 3 rows according to the definition above)
I'm confused : the array ([0,2,4]) should have 3 columns not 3 rows so I expect it to return (,3) instead.
Can anyone help to clarify ? Thanks a lot.
This is just notation. In Python, tuples are distinguished from expression grouping (order-of-operations parentheses) by the use of commas: (1,2,3) is a tuple, while (2*x + 4) ** 5 contains the grouped expression 2*x + 4. A single value in parentheses would otherwise be ambiguous ((1) could be a single-element tuple or a simple expression that evaluates to 1), so Python uses a trailing comma, as in (1,), to denote a single-element tuple.
What you're getting is a single dimension response, since there's only one dimension to measure, packed into a tuple type.
NumPy supports not only 2-dimensional arrays but arrays of any number of dimensions: 1-D, 2-D, 3-D, ..., n-D, and the shape has a corresponding format for each. The len of array.shape gives the number of dimensions of that array. If the array is 1-D, there is no need to represent its shape as (m, n), and if the array is 3-D, (m, n) would not be sufficient to represent its dimensions.
So the output of array.shape would not always be in (m, n) format, it would depend upon the array itself and you will get different outputs for different dimensions.
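A short sketch of how the shape tuple grows with the number of dimensions, and of how a 1-D array can be given an explicit row or column orientation with reshape. The array names are illustrative.

```python
import numpy as np

v = np.array([0, 2, 4])                  # 1-D
m = np.array([[0, 2, 4], [1, 3, 5]])     # 2-D
t = np.zeros((2, 3, 4))                  # 3-D

# len(shape) always equals the number of dimensions.
for arr in (v, m, t):
    print(arr.ndim, len(arr.shape), arr.shape)

# A 1-D array is neither a row nor a column until it is given a second axis.
as_row = v.reshape(1, 3)
as_column = v.reshape(3, 1)
print(as_row.shape, as_column.shape)     # (1, 3) (3, 1)
```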

How to interpret the shape of the array in Python?

I am using a package and it is returning me an array. When I print the shape it is (38845,). Just wondering why this ','.
I am wondering how to interpret this.
Thanks.
Python has tuples, which are like lists but of fixed size. A two-element tuple is (a, b); a three-element one is (a, b, c). However, (a) is just a in parentheses. To represent a one-element tuple, Python uses a slightly odd syntax of (a,). So there is only one dimension, and you have a bunch of elements in that one dimension.
It sounds like you're using Numpy. If so, the shape (38845,) means you have a 1-dimensional array, of size 38845.
It seems you're talking of a Numpy array.
shape returns a tuple with the same size as the number of dimensions of the array. Each value of the tuple is the size of the array along the corresponding dimensions, or, as the tutorial says:
An array has a shape given by the number of elements along each axis.
Here you have a 1D-array (as indicated by the 1-element tuple notation with the comma, as @Amadan said), and the size of the 1st (and only) dimension is 38845.
For example (3,4) would be a 2D-array of size 3 for the 1st dimension and 4 for the second.
You can check the documentation for shape here: http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.shape.html
Just wondering why this ','.
Because (38845) is the same thing as 38845, but a tuple is expected here, not an int (since in general, your array could have multiple dimensions). (38845,) is a 1-tuple.
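In practice, the single size can be read out of the 1-tuple in several equivalent ways. In this sketch, arr is a stand-in created with np.zeros, since the package that returned the original array is not named in the question.

```python
import numpy as np

arr = np.zeros(38845)   # stand-in for the array returned by the package

(n,) = arr.shape        # unpack the 1-tuple into a plain integer
print(n, arr.shape[0], len(arr), arr.size)
```

For a 1-D array all four values agree. For arrays with more dimensions, len(arr) and arr.shape[0] give only the size of the first axis, while arr.size gives the total number of elements.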
